1 Center Objectives and the Status of the Effort

The MURI Center on Modeling and Control of Plasma Processing at the University of
Michigan started in September 1995 and concluded technical work at the end of August
2001. As the name indicates, the major research goals of the center were in the areas of
modeling and control of plasma deposition and etching processes. These plasma
processes are used extensively in the manufacture of integrated circuits as well as active
matrix liquid crystal displays. These application areas motivated our selection of research
problems in modeling and control.

The  major  goal  of  this  MURI  Center  was  to  develop  basic  science  and  technology  to 
enable  significant  improvements  in  the  robustness  and  performance  of  plasma  etching 
and  deposition  used  in  microelectronics  manufacturing.  The  research  strategy  was  to  use 
a  synergistic  combination  of  modeling,  sensors,  and  control.  We  were  also  focused  on  the 
ultimate  manufacturing  applications,  which  will  impact  cost  and  quality  of  the 
microelectronics  products.  In  turn  this  will  be  of  benefit  to  the  Department  of  Defense. 

Over  the  contract  period,  we  focused  on  the  following  main  topics: 

1. Modeling - first principles plasma modeling and statistical process modeling

2.  Plasma  sensors  -  RF  sensing,  optical  emission  sensing,  CF2  sensing 

3.  Wafer  sensors:  Modeling,  Signal  Processing,  and  Control 

4.  Real-time  control  using  both  process  and  wafer  sensors 

5.  Process  and  materials  research 

Significant  accomplishments  were  made  in  all  of  these  areas  (as  will  be  discussed  in  the 
body  of  this  report).  Particular  program  highlights  include: 

1. An optical technique was developed to monitor in situ and in real-time the critical
dimensions and wall-shapes of evolving features in reactive ion etchers. An
advanced signal processing scheme was devised to use this technique to perform
the first fully-automated etch-to-target-dimension etches. One-nanometer-level
(or better) accuracy was demonstrated, opening the possibility of extremely high
accuracy semiconductor fabrication control.

2. The state of the art of first-principles plasma equipment modeling was advanced so
that  the  entire  system  of  the  sensors,  plasma  process  equipment,  and  control 
systems  could  be  modeled  numerically. 

3.  Novel  RF  Sensing  to  non-invasively  measure  the  electrical  state  of  plasma 
systems  was  developed  and  applications  to  detecting  common  faults  were 
demonstrated. 

4. Improved statistical methods for detecting and identifying the causes of spatially
clustered defects in semiconductor manufacturing were developed.

5. Development of a novel ion-beam modification process for the deposition of Al
films  which  are  more  resistant  to  grain-growth. 

6.  Development  of  improved  plasma  deposition  processes  for  the  manufacture  of 
high  performance  AMLCDs. 

The most striking of these results is item 1, the development of both physical sensing
methods to accurately extract deep sub-micron topographic information in situ and in
real-time, and the control/signal processing methodology to make use of this information
for very high accuracy etch process control. This set of results has attracted a great deal
of attention in both scientific and industrial communities. Frankly, this level of success
was well beyond what we had anticipated at the beginning of this project. It was
directly the result of the ability offered by this MURI to combine the efforts of
researchers in multiple separate disciplines to focus on a very significant problem.

Some aspects of the research successes of this program, including the in situ topography
measurement and control, were pushed further under a NIST-ATP program, “Intelligent
Control of the Semiconductor Patterning Process” (cooperative agreement No.
70NANB8H4067). Under this ATP effort several results of the research of this MURI
program were extended and pushed closer to industrial application:

• The Broadband RF sensor (section 3.1) was extended to full real-time
feedback control of plasma density, and very significant stabilization of Cl2
etching of Si was achieved.

• The real-time optical topography methods were further developed and
demonstrated to industry. While the industry is not yet ready for the in situ,
real-time application, these successful demonstrations have helped speed the
application of the still useful but more conservative in-line, wafer-to-wafer
control approach.

•  Industrial  applications  of  combined  feed-back/feed-forward  control  methods 
were  made  involving  University  of  Michigan  developed  algorithms  being 
implemented  at  Lam  Research  Corp.  using  KLA-Tencor  in-line  metrology 
tools  on  Motorola  wafers. 

In  summary,  we  believe  that  this  MURI  program  achieved  major  technical  successes  that 
were  the  direct  result  of  synergistic  activities  among  researchers  who  otherwise  would 
not  have  had  the  resources  and  freedom  to  work  together  on  these  problems.  Some  of 
these  successes  have  already  been  internationally  recognized,  and  industrial  applications 
are  already  being  made.  The  technology  transfer  was  facilitated  by  the  ATP  program 
mentioned  above,  but  there  was  also  technology  transfer  directly  through  our  published 
papers.  For  instance,  we  learned  at  a  major  symposium  on  process  control  that  an 
engineer  at  Micron  Technologies  had  read  our  papers  on  extended  Kalman  filtering  for 
thin  film  thickness  monitoring  and  had  directly  applied  them  to  his  production 
development problems. While it is difficult or impossible to document this impression, we
believe  that  the  successes  of  this  program  (and  those  of  the  other  MURI  efforts  in  this 
series)  significantly  helped  with  the  semiconductor  industry’s  move  toward  real  use  of 
advanced  process  control. 

2 Plasma Process First Principles Modeling

The  University  of  Illinois  (UI)  research  tasks  were  to  develop  plasma  equipment  models 
as  vehicles  to  both  test  control  strategies  for  plasma  etching  and  deposition  tools,  and  to 
generate  system  performances  and  responses  for  parameters  which  might  otherwise  be 
difficult  to  experimentally  provide.  The  motivation  for  this  work  is  that  if  one  can 
develop  a  sufficiently  comprehensive  computational  representation  of  the  plasma 
equipment,  realistic  sensor  inputs  can  be  provided  for  a  controller;  and  the  control 
directives  can  be  implemented  on  a  “virtual  basis”.  In  doing  so  one  can  speed  the 
development  time  for  new  control  methodologies  and  do  so  with  more  physical 
understanding  of  the  process. 

The  foundation  for  our  modeling  activities  is  the  Hybrid  Plasma  Equipment  Model 
(HPEM).  The  HPEM  is  a  comprehensive  plasma  equipment  simulator  which  is  able  to 
address  a  wide  variety  of  tools,  including  reactive  ion  etching  (RIE)  and  inductively 
coupled plasma (ICP) sources. In a subset of our MURI research tasks, we
continued technical development of the HPEM to improve its capabilities. During the
MURI we developed an interface to the HPEM which provides virtual sensor inputs
and actuator directives to the HPEM from a controller module (CM). We call this
framework the Virtual Plasma Equipment Model (VPEM) (see Figure 1). Conceptually,
we  treat  the  VPEM  exactly  as  one  would  treat  an  experimental  plasma  tool  equipped  with 
sensors  and  actuators.  We  specify  operating  conditions,  sensor  inputs  and  control  points, 
and  a  desired  mode  of  operation.  We  then  perturb  the  system.  The  CM  takes  the  sensor 
inputs,  implements  a  control  strategy,  and  specifies  actuator  directives  which  bring  the 
plasma  tool  back  onto  desired  operating  specifications.  Using  the  VPEM,  a  number  of 
scenarios  have  been  investigated  in  which  response  surface  and  PID  based  controllers 
were  used  to  compensate  for  external  disturbances,  nullify  the  effect  of  long  term  drifts 
and  improve  uniformity. 


Figure  1  Schematic  of  the  Virtual  Plasma  Equipment  Model  (VPEM). 
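
The sensor/controller/actuator loop described above can be sketched in a few lines. This is a minimal illustration only: the first-order "virtual tool", the gains, and the names `PID` and `run_loop` are assumptions made for this example and are not part of the HPEM or VPEM.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID controller (hypothetical stand-in for the controller module)."""
    kp: float
    ki: float
    kd: float
    setpoint: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, measurement: float, dt: float) -> float:
        error = self.setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run_loop(controlled: bool, steps: int = 400, dt: float = 0.05) -> float:
    """Toy virtual tool: the sensor relaxes toward (actuator - disturbance)."""
    pid = PID(kp=0.5, ki=2.0, kd=0.0, setpoint=1.0)
    tau = 0.5            # assumed response time of the toy plant
    base_actuator = 1.0  # nominal actuator setting that gives sensor = 1.0
    sensor = 1.0         # start on target
    for k in range(steps):
        # Perturb the system at step 100 (e.g. an external disturbance).
        disturbance = 0.3 if k >= 100 else 0.0
        # The controller reads the sensor and issues an actuator directive.
        correction = pid.update(sensor, dt) if controlled else 0.0
        actuator = base_actuator + correction
        # First-order plant response, integrated with an explicit Euler step.
        sensor += dt / tau * (actuator - disturbance - sensor)
    return sensor


if __name__ == "__main__":
    print("final sensor value without control:", round(run_loop(False), 4))
    print("final sensor value with control:   ", round(run_loop(True), 4))
```

In this toy setting the uncontrolled sensor settles at the perturbed value, while the integral term of the controller returns it to the setpoint. The real VPEM replaces the one-line plant with the full equipment model.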

As an example of these capabilities, we demonstrate the use of the VPEM to
investigate pulsed plasma processing and, in particular, the control of transients during,
for example, recipe changes. We highlight one study in which we investigated the
appropriateness of using actinometry sensors during a transient in which the wall sticking
coefficient of Cl atoms changes. The densities of
Ar, Cl and Cl2 in an inductively coupled plasma during a transient in which the wall
reassociation coefficient of Cl atoms increases by a factor of 4 are shown in Figure 2. The
density of Cl decreases, while those of Ar and Cl2 increase. The Ar density increases
because the total flow rate has decreased in order to keep the pressure and mass flux
constant while the Ar fractional flow rate remains constant.


Figure  2  Densities  in  an  ICP  reactor  during  a  transient  during  which  the  wall  recombination  rate 
of  Cl  atoms  increases. 



An  actinometry  sensor  was  used  to  maintain  the  density  of  Cl  atoms  constant  using  a 
PID  controller.  The  actinometry  signal  is  the  ratio  of  optical  emission  Cl*/Ar*.  The 
actinometry  signal  closely  follows  the  Cl  density  prior  to  the  transient,  as  shown  in 
Figure  3.  After  the  transient  the  actinometry  signal  and  Cl  density  (in  the  absence  of 
control)  diverge.  This  occurs  because,  while  keeping  the  pressure  and  mass  flux 
constant,  the  mole  fraction  of  Ar  in  the  system  increases.  As  a  result,  the  denominator  in 
the  actinometry  sensor  increases,  making  the  signal  decrease  relative  to  the  Cl  density. 

Figure 3 Actinometry signal and actual Cl density before and during the transient.
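
The divergence between the actinometry signal and the true Cl density can be illustrated with a small calculation. The sketch below assumes the signal is simply proportional to the ratio of Cl to Ar densities; the proportionality constant `k` and the density values (arbitrary units) are hypothetical and chosen only to show the direction of the effect.

```python
def actinometry_signal(n_cl: float, n_ar: float, k: float = 1.0) -> float:
    """Cl*/Ar* emission ratio, assumed here to scale as n_cl / n_ar."""
    return k * n_cl / n_ar


# Hypothetical densities, normalized to 1.0 before the transient.
before = {"n_cl": 1.0, "n_ar": 1.0}
after = {"n_cl": 0.6, "n_ar": 1.2}  # Cl falls while Ar rises (illustrative)

density_ratio = after["n_cl"] / before["n_cl"]
signal_ratio = actinometry_signal(**after) / actinometry_signal(**before)

if __name__ == "__main__":
    print(f"true Cl density relative to start:  {density_ratio:.2f}")
    print(f"actinometry signal relative to start: {signal_ratio:.2f}")
```

Because the Ar term sits in the denominator, the signal falls further than the Cl density itself, so a controller that trusts the signal would over-react.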


When  one  then  tries  to  control  this  transient  using  a  PID  controller  (which  equates 
Cl*  signal  to  power  deposition),  the  response  is  to  recommend  an  increase  in  power.  The 
increase  in  power  produces  an  increase  in  gas  temperature  and  a  commensurate  decrease 
in gas densities which further "motivates" an increase in power (see Figure 4). The end
result is that the system becomes unstable.


It is important, then, to choose sensors and actuators which are well correlated in a
well-behaved manner over large dynamic ranges. The choice of such sensors is often not
immediately obvious. Through the use of the VPEM, such sensors can be selected. For
example, we revisit the control problem in a chlorine inductively coupled plasma where
the wall sticking coefficient for Cl → Cl2 on the walls impulsively increases. This
change in sticking coefficient not only changes the magnitude of fluxes to the substrate
but also their uniformity. For example, the densities of Cl and Cl2 in an inductively
coupled plasma with the operating conditions Ar/Cl2 = 50/50, 20 mTorr, 500 W, 200
sccm are shown in Figure 5. The corresponding changes in ion flux and uniformity are
shown in Figure 6.


Figure 5 Densities of Cl and Cl2 in an ICP reactor following a change in sticking coefficient
(Cl → Cl2: 0.05 → 0.4).


Figure 6 Ion flux and uniformity of the ion flux resulting from a change in wall sticking
coefficient. 


Through  parameterization  of  the  VPEM,  we  found  that  ion  flux  was  well  controlled 
by  ICP  power  and  ion  flux  uniformity  could  be  controlled  by  use  of  a  static  magnetic 
field.  The  control  of  ion  flux  by  power  deposition  is  straightforward.  The  control  of  ion 
flux  uniformity  by  use  of  a  magnetic  field  results  from  a  change  in  the  spatial  distribution 
of power deposition. In short, the plasma tensor conductivity in the presence of a static
magnetic  field  “pushes”  power  deposition  around  by  generating  additional  components  of 
the  inductively  coupled  electric  field.  These  trends  are  demonstrated  in  Figure  7  where 
power  deposition  with  and  without  a  5  G  magnetic  field  is  shown. 

Using two PID controllers, one correlating power deposition (actuator) with ion flux
(sensor) and the other correlating magnetic field (actuator) with ion flux uniformity
(sensor), a control strategy was developed which maintained the magnitude and
uniformity of the ion flux through the transient. These results are shown in Figure 8.




Figure  7  Power  deposition  with  no  magnetic  field  and  a  magnetic  field  of  5  G.  The  magnetic 
field  vectors  are  shown  at  right. 

Figure 8 Ion flux and uniformity of the ion flux resulting from a change in wall sticking
coefficient using 2 PID controllers having ICP power and magnetic fields as actuators.


Another control problem we investigated was how to address conditions which
may greatly deviate from expected or “base case” operation. In the discussion that
follows, the reactor shown schematically in Figure 9 will be used. This is an
inductively-coupled-plasma (ICP) reactor excited by three coils. Gas is injected into the
reactor through a showerhead nozzle and exhausted at the bottom. Sensors include a Langmuir
probe to measure electron density, a surface electrical probe to measure ion current, a
mass spectrometer, and observations of optical emission from three locations above the
wafer. The optical emission measurements will be used to gauge uniformity of etching
on the wafer. The actuators include gas pressure, ICP power deposition and the relative
amount of current flowing through the three ICP coils, which can be used to affect plasma
uniformity.


Figure  9  Schematic  of  the  ICP  reactor  used  to  develop  strategies  for  stabilizing  reactor 
performance  through  transients. 


The first example we will discuss is for an ICP reactor operating in pure argon when a
"puff" of nitrogen gas is released into the chamber. This simulates a transient failure in a
mass flow controller. The Ar+ and electron densities during this transient are shown in
Figure 10. The conditions are 10 mTorr with 250 sccm of gas flow. When 25 sccm of
nitrogen are "puffed" into the reactor, the total electron density (now balanced by Ar+ and
N2+) decreases because N2 dissipates more power in non-ionizing collisions than does
argon.  As  the  N2  flows  into  a  larger  volume  of  the  reactor,  the  plasma  density  continues 
to  decrease.  After  the  mass-flow-controller  is  corrected  and  the  N2  stops  flowing,  there  is 
a  clearing  time  for  the  N2  to  exhaust  from  the  reactor,  during  which  the  plasma  density 
recovers  to  its  original  value. 

A  2  x  2  response  surface  based  controller  was  formulated  using  power  and  pressure  as 
the  actuators,  and  electron  density  and  ion  current  to  the  substrate  as  sensors.  The  results 
of the control exercise are shown in Figure 11. The electron density is only mildly
stabilized  because  the  controller  has  insufficient  information  on  "future"  conditions. 
Sensor  readings  are  made  when  the  plasma  is  responding  to  a  particular  mole  fraction  or 
spatial  distribution  of  N2.  The  recommended  actuator  settings  are  then  implemented 
based  on  these  sensor  signals.  When  the  N2  density  is  increasing,  conditions  continually 
worsen,  and  so  the  recommended  changes  in  actuator  settings  soon  become  too  small  to 
restore  the  sensors  to  their  target  values.  The  sensor  readings  initially  move  towards  their 
target  values,  but  then  begin  to  deviate.  When  the  N2  density  is  decreasing,  conditions 
continually  improve  after  the  sensor  readings  and  actuator  changes,  and  a  similar 
situation  results,  though  the  sign  of  the  recommended  change  in  actuator  settings 
reverses.  These  are  conditions  which  are  not  well  addressed  by,  for  example,  PID 
controllers. 
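
The lag described above can be reproduced with a toy 2 x 2 response-surface controller. This is a sketch under stated assumptions: the sensitivity matrix `J`, the linear plant, and the ramping disturbance are hypothetical, and the controller is given a perfect model so that any remaining error comes only from the disturbance changing between sensor readings.

```python
# Hypothetical sensitivities d(sensor_i)/d(actuator_j) around the base case.
J = ((2.0, 0.5),
     (0.3, 1.5))
TARGET = (1.0, 1.0)  # target values of the two sensors (arbitrary units)


def solve_2x2(m, r):
    """Solve m @ x = r for a 2 x 2 matrix using Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det)


def plant(u, disturbance):
    """Linear toy plant; the disturbance lowers both sensor readings equally."""
    return tuple(TARGET[i] + J[i][0] * u[0] + J[i][1] * u[1] - disturbance
                 for i in range(2))


def run(ramp_per_step, steps=10):
    """Return the first-sensor error seen just before each new reading."""
    u = (0.0, 0.0)
    errors = []
    for k in range(steps):
        # Sensor reading reflects the disturbance at the time of the reading.
        y = plant(u, ramp_per_step * k)
        residual = (TARGET[0] - y[0], TARGET[1] - y[1])
        du = solve_2x2(J, residual)
        u = (u[0] + du[0], u[1] + du[1])
        # The disturbance keeps growing before the next reading is taken.
        y_later = plant(u, ramp_per_step * (k + 1))
        errors.append(abs(TARGET[0] - y_later[0]))
    return errors


if __name__ == "__main__":
    print("error per step, default rate:", [round(e, 3) for e in run(0.10)])
    print("error per step, doubled rate:", [round(e, 3) for e in run(0.05)])
```

Even with a perfect model, each correction is sized for the disturbance at the time of the reading, so the controller is always one interval behind a ramping disturbance. Halving the interval between readings (here, halving `ramp_per_step`) halves the residual error in this toy model.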


Figure  10  (left)  Electron  and  ion  densities  during  a  transient  where  an  errant  mass  flow  controller 
"puffs"  N2  into  an  ICP  reactor. 


Figure  11  (right)  Electron  and  ion  densities  when  control  is  applied  to  stabilize  the  electron 
density  during  a  transient  where  N2  is  "puffed"  into  an  ICP  reactor. 

Figure 12 Electron density in the ICP reactor with a N2 gas leak at times of sensor readings for
the uncontrolled case, control at the default frequency and at twice the default frequency.


There are at least three strategies to correct for these changes in operating conditions: 1)
Increase the frequency of the controller so that lack of future information is not as
critical. 2) Build knowledge of the characteristics of the perturbed system into the
controller. 3) Dynamically adjust gain to compensate for under- (or over-) predicting
changes in actuator settings. Option 2 is difficult to employ if the goal is to control
against unpredictable transients. Option 1 has physical limitations in that there is a
practical upper limit to the controller frequency. For example, the electron densities in
the ICP during the N2 puff at the time of sensor readings are shown in Figure 12 for the
perturbed (uncontrolled) conditions, with control at the default frequency and with
control at twice the default frequency. The poor knowledge of the future, which results in
underpredicting changes in actuator settings, is somewhat compensated for by the
increase in controller frequency.

There is, however, a common procedure where “knowledge of the future” can be used
to control against transients. This procedure is a change in recipe where, for example, the
power or gas mixture is changed as one transitions between the "main etch" and the
"overetch". We investigated the recipe change of switching from an Ar/Cl2=90/10 gas
mixture to an Ar/Cl2=99/1 mixture. The densities of Ar, Cl and Cl2 during the transient
(250 sccm, 10 mTorr) are shown in Figure 13. Uniformity will be controlled using
optical emission from Cl* from three locations above the wafer as shown in Figure 14.
The actuators will be the relative amount of current flowing through the inner and outer
coils. The problem is that as the Cl2 mole fraction changes during the transient, the
response surfaces which are used to determine actuator adjustments also change.
Knowing ahead of time that the transient involves a change in Cl2 mole fraction (this is
the "knowledge of the future"), one can use an additional sensor, in this case a mass
spectrometer, to measure the Cl2 mole fraction. This additional data is then used to
interpolate between response surfaces which were prepared for different Cl2 mole
fractions in steady state experiments. This is called using additional "planes" of response
surfaces. Using more planes should yield better results since the coefficients used for any
given Cl2 mole fraction are more representative of the instantaneous conditions.
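
The idea of interpolating between "planes" can be sketched as follows. The gain curve `true_gain` is a hypothetical stand-in for how a response-surface coefficient might vary with Cl2 mole fraction, and the plane locations are chosen only to span the 1% to 10% range of the recipe change; neither is taken from the report.

```python
from bisect import bisect_right


def true_gain(x):
    """Hypothetical response-surface coefficient versus Cl2 mole fraction."""
    return 1.0 / (0.02 + x)


def make_planes(mole_fractions):
    """Tabulate the coefficient at each steady-state mole fraction (one 'plane' each)."""
    return [(x, true_gain(x)) for x in sorted(mole_fractions)]


def interpolated_gain(planes, x):
    """Linearly interpolate between planes; hold the end values outside the range."""
    xs = [p[0] for p in planes]
    if x <= xs[0]:
        return planes[0][1]
    if x >= xs[-1]:
        return planes[-1][1]
    i = bisect_right(xs, x)
    (x0, g0), (x1, g1) = planes[i - 1], planes[i]
    t = (x - x0) / (x1 - x0)
    return g0 + t * (g1 - g0)


def worst_error(planes):
    """Largest coefficient error over mole fractions from 1% to 10%."""
    grid = [i / 1000.0 for i in range(10, 101)]
    return max(abs(interpolated_gain(planes, x) - true_gain(x)) for x in grid)


one_plane = make_planes([0.10])
two_planes = make_planes([0.01, 0.10])
three_planes = make_planes([0.01, 0.05, 0.10])

if __name__ == "__main__":
    for name, planes in [("1 plane", one_plane), ("2 planes", two_planes),
                         ("3 planes", three_planes)]:
        print(name, "worst-case coefficient error:", round(worst_error(planes), 2))
```

With a single plane the controller uses coefficients prepared for one mole fraction throughout the transient. Each added plane brings the interpolated coefficient closer to the instantaneous value, which is the behavior the text anticipates.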

Figure 13 Densities of Ar, Cl and Cl2 during a recipe change (250 sccm, 10 mTorr).

Figure 14 Uniformity of optical emission from Cl* during a recipe change using 1, 2 and 3 planes
of control (250 sccm, 10 mTorr).


An example of stabilizing uniformity through a recipe change is shown in Figure 14.
The uniformity parameter a without control [a ranges from 0 (poor) to 1 (perfect
uniformity)] is 0.68-0.71. With control, a is improved to 0.95. The length of time for which a
can be held at a high value increases as the number of planes of control increases.

A major portion of our efforts was devoted to the development of more
robust and comprehensive plasma models, as embodied in the HPEM/VPEM, to address
a wider variety and complexity of plasma tools. In work jointly funded by the National
Science Foundation, Semiconductor Research Corporation, Applied Materials and Lam
Research Corporation, significant new capabilities were developed, as discussed in the
attached publications. One such development is described here.

Major modifications were made to the HPEM to enable simulation of long-term
transients and pulsed plasmas. The HPEM was converted to a moderately parallel
code wherein each of the major modules is executed on a different processor. This
enables plasma properties being updated in, for example, the fluid kinetics module to be
made immediately available, through shared memory, to the electron Monte Carlo
Simulation. This methodology is schematically shown in Figure 15. These algorithms
were implemented using OpenMP protocols on a 4-processor Sun Microsystems server.
This has resulted, to our knowledge, in the first 2-dimensional fully transient plasma
equipment model.


Figure  15  Schematic  of  the  parallel  HPEM  which  exchanges  plasma  properties  through  shared 
memory. 


Examples of the results from the parallel HPEM are shown in Figures 16-18.
The electron density and plasma potential during pulsed operation of the ICP GEC
Reference Cell are shown in Figure 16. The operating conditions are Ar/Cl2=80/20,
20 mTorr, 300 W with a 30% duty cycle at 10 kHz. When power is turned off, the
electron temperature falls and attachment in the afterglow is enhanced, so
the electron density decreases to a few percent of its peak. The plasma potential then
falls to a few tenths of a volt. The plasma potential first drops rapidly due to
electron cooling, then less rapidly due to electron attachment. As a consequence of
the drop in plasma potential, negative ions, which would normally be trapped by the
positive plasma potential, are able to escape the plasma and reach the substrate. This
negative ion flux is important to neutralizing plasma induced damage of
microelectronic components.

Figure 16 Electron density and plasma potential during pulsed operation of an ICP reactor. The
pulse repetition rate is 10 kHz and the duty cycle is 30%.
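
For reference, the pulse timing implied by a given repetition rate and duty cycle follows from simple arithmetic. The helper below is a hypothetical convenience function, not part of the HPEM; it only converts the quoted operating conditions into power-on and afterglow durations.

```python
def pulse_timing_us(prf_hz: float, duty_cycle: float) -> tuple[float, float]:
    """Return (power-on time, afterglow time) in microseconds for one pulse period."""
    period_us = 1.0e6 / prf_hz
    on_us = duty_cycle * period_us
    return on_us, period_us - on_us


if __name__ == "__main__":
    for duty in (0.30, 0.50):
        on_us, off_us = pulse_timing_us(10.0e3, duty)
        print(f"10 kHz, {duty:.0%} duty cycle: "
              f"on for {on_us:.0f} us, afterglow for {off_us:.0f} us")
```

At 10 kHz the pulse period is 100 microseconds, so a 30% duty cycle corresponds to roughly 30 microseconds of power-on time and a 50% duty cycle to roughly 50 microseconds.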


For example, the plasma potential and negative ion flux vectors are shown in Figure
17 for conditions similar to those discussed above. The charged particle fluxes to the
substrate are shown in Figure 18. During the period that power is on, the negative ion
flux vectors point into the plasma where negative ions are consumed by ion-ion
neutralization. When the power is turned off at 50 µs, the plasma potential collapses.
The negative ion flux vectors then turn, in a wave-like manner, to point towards
surfaces, indicating that the negative ions can escape from the plasma. As the chlorine
fraction increases, the rate of collapse of the electron density and plasma potential
increases, thereby enabling negative ions to more rapidly escape from the plasma.
Negative ions are then incident on the substrate for a longer period of time.

Figure 17 Cl- negative ion flux and plasma potential in Ar/Cl2 plasmas at different times for a
power of 300 W, PRF of 10 kHz and duty cycle of 50%. (Cl- flux vectors are all of the same
length and not scaled with respect to magnitude.) Results are shown for a) 0 µs, b) 5 µs, c) 10 µs
and d) 20 µs during the power-on period and e) 60 µs and f) 90 µs during the afterglow. Negative
ions are extracted only after 70 µs into the afterglow.

Figure 18 The temporal dynamics of positive ion, electron and Cl- flux to the substrate as a
function of Cl2 fraction for Ar/Cl2 = 30/70 and Ar/Cl2 = 80/20. As the Cl2 fraction increases,
negative ions can be extracted for a longer period in the afterglow.


3  Sensors:  Modeling,  Signal  Processing,  and  Control 

Our  general  strategy  for  plasma  process  control  involved  the  use  of: 

•  process  sensors  for  high-speed  feedback  control,  and; 

•  substrate  (wafer)  sensors  for  high  accuracy  endpoint  detection  and  for  process 
state  to  wafer  state  model  construction  and  verification. 

Our  process  state  sensor  efforts  were  concentrated  on  two  areas:  (1)  measurement  of  the 
electrical  state  of  the  plasma  through  advanced  RF  measurements;  and,  (2)  the 
measurement of polymer precursors (CF2) in CF4 plasmas. The major results of the latter
work on CF2 sensing have been described in prior technical reports in this program and will
not  be  repeated  here.  The  RF  measurement  work  proved  to  be  more  successful  in  the 
process  control  work  in  this  program  and  will  be  summarized  in  section  3.1. 

Our wafer state measurements continued to be focused on reflected light measurements
(reflectometry and ellipsometry). Our major emphasis in this area was to extend our
previously reported successes in automated endpoint detection using Extended Kalman
Filtering to patterned wafers.

3.1 RF Sensing

Feedback  control  and  diagnostics,  whether  implemented  in  a  run-to-run  or  real-time 
fashion,  require  sensed  information  about  the  process  and/or  its  environment.  The 
overarching  goal  of  this  project  is  to  obtain  real-time  process  information  through  non- 
intrusive  means.  The  term  RF  sensing  typically  refers  to  the  action  of  measuring  the 
13.56  MHz  RF  signal  used  to  strike  a  plasma  in  a  plasma  etch  or  deposition  tool,  and 
extracting information about deposited power, impedance, and strength of harmonics
induced  by  the  nonlinearities  in  the  plasma  and  etch  chamber.  The  first  part  of  our  work 
focused  on  quantifying  in  a  rigorous  manner  the  well-posedness  of  this  measurement  and 
its  potential  for  providing  reliable  process  information.  We  concluded  that  the  RF 
measurement  techniques  used  at  the  time  in  industry  and  academia,  and  marketed  by 
several  suppliers,  were  fundamentally  flawed  and  could  not  be  ameliorated  in  any 
practical  manner.  Our  research  in  the  MURI  Program  has  done  much  to  explain 
limitations  of  previous  work  in  this  area.  We  therefore  embarked  on  the  development  of  a 
novel  sensor  based  on  plasma  impedance  spectroscopy.  This  second  phase  of  our 
research  resulted  in  the  design  of  a  minimally-intrusive,  Broadband  RF  sensor  that  has  a 
100- to 1,000-fold increase in sensitivity to conditions in
the  plasma.  This  sensor  operates  in  the  300  MHz  to  2  GHz  range  and  responds  to  changes 
in  deposited  power,  chemical  concentration,  pressure,  and  even  certain  aspects  of  the 
wafer’s  state.  In  the  third  phase  of  our  research,  we  implemented  this  sensor  on  our  Lam 
etch  tool  and  developed  signal  processing  techniques  to  extract  information  from  the 
Broadband  RF  sensor.  We  demonstrated  the  combination  of  the  sensor  and  algorithms  on 
Si  etch  rate  extraction  and  actuator  fault  detection  and  isolation.  This  section  details  our 
accomplishments  in  RF  sensing  over  the  period  of  this  contract. 

3.1.1 RF Sensing at 13.56 MHz


Plasma  processing  specialists  in  industry  and  academia  had  recognized  that  a  substantial 
amount  of  information  about  the  plasma  state  should  be  contained  in  the  RF  signal  (13.56 
MHz)  and  its  harmonics.  On  the  surface,  making  current  and  voltage  measurements 
should  be  straightforward,  and  the  real  work  should  lie  in  developing  the  mathematical 
models  between  measurements  and  key  plasma  quantities  (absorbed  power,  electron 
concentration  and  temperature,  ion  concentration,  etc.).  Unfortunately,  this  is  not  the 
case. Work reported in this area from Stanford, Berkeley, MIT, Texas-Austin and NIST,
as well as both US and Japanese chip manufacturers, has revealed that the RF sensing
problem itself at the 13.56 MHz drive frequency is non-trivial. In particular, one of our original
industrial partners, Optical Imaging Systems, tried off-the-shelf RF technology, and it
failed.

Our  work  has  investigated  the  fundamental  issues  that  determine  the  limits  on  the 
accuracy  with  which  RF  signals  can  be  measured  in  a  plasma  environment.  We  have 
investigated  this  with  respect  to  properties  of  the  reactor  (high  density,  low  density, 
pressure  range  and  type  of  processing  gas),  the  choice  of  RF  probe  technology  (coupler 
based  or  current  voltage  based),  and  with  respect  to  the  receiver  technology  (vector 
network  analyzer,  power  meter,  vector  voltmeter  or  digital  oscilloscope).  We  have  shown 
that  the  single  most  important  factor  affecting  the  success  of  RF  monitoring  at  these 
frequencies  is  the  voltage  standing  wave  ratio  (VSWR)  of  the  RF  power  system.  The 
VSWR  is  determined  by  the  reactor’s  density,  pressure  range  and  process  gases.  A 
typical  low-density  (single  source,  capacitively  coupled)  plasma  system  at  20  mTorr  with 
fluorocarbon  chemistry,  as  is  commonly  used  in  the  display  industry,  will  have  a  VSWR 
of around 50. Under these conditions, we have shown that an inaccuracy in the raw
voltage and current measurement at the sensor is magnified one hundredfold in the
computation of deposited power. The second most important factor in determining
accuracy is the quality of the model that relates sensor measurements to the measured
electrical state. Our work has revealed that a directional coupler in tandem with a
high-quality receiver is inherently more linear than a current-voltage probe with an
oscilloscope. The relationship between the true electrical state, as established by a
network analyzer, and the measurement of the directional coupler can be modeled very
accurately by a two-port electrical network. Other sensor-receiver combinations turn out
to be quite nonlinear, and require an inordinate number of data points for calibration
(i.e., accurate model building).


                                        Raw Data    Model Based Correction
Laboratory Quality Measurements         4.28%       0.98%
Realistic Probe Receiver Combination    21.30%      12.8%
Common Practice                         122.40%     145%

Table 1 Comparison of relative errors in computed power at a VSWR of 50. The realistic probe
receiver combination consists of a pair of power meters and a directional coupler. A dual channel
power meter should further reduce the error by half over a pair of power meters.
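
The way a high VSWR magnifies measurement error can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: an ideal directional coupler reporting forward and reverse power on a lossless line, with function names chosen here for convenience. It does not reproduce the hundredfold figure quoted above, which concerns raw voltage and current errors, but it shows the same trend.

```python
def reflection_magnitude(vswr):
    """Magnitude of the reflection coefficient for a given VSWR."""
    return (vswr - 1.0) / (vswr + 1.0)


def delivered_fraction(vswr):
    """Fraction of forward power that is actually delivered (1 - |gamma|^2)."""
    g = reflection_magnitude(vswr)
    return 1.0 - g * g


def worst_case_error_gain(vswr):
    """Relative error in (P_fwd - P_rev) per unit relative error in each
    power reading, assuming the two readings err in opposite directions."""
    g2 = reflection_magnitude(vswr) ** 2
    return (1.0 + g2) / (1.0 - g2)


if __name__ == "__main__":
    for vswr in (1.5, 5.0, 50.0):
        print(vswr, delivered_fraction(vswr), worst_case_error_gain(vswr))
```

Under these assumptions, a VSWR of 50 leaves about 7.7% of the forward power delivered, and a 1% error in each power reading can become roughly a 25% error in delivered power.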


We also developed two-port models for the reactor chamber and its power connections,
allowing us to de-embed it from the electrical measurements, and hence accurately
estimate the power absorbed by the plasma itself (instead of the plasma and chamber), as
well as the plasma impedance. This in turn has led to direct feedback control of the
plasma via electrical measurements and the on-line estimation of quantities such as
sheath thickness and electron concentration on our Applied Materials 8300 reactor.
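
De-embedding with a two-port model can be sketched as follows. This is a minimal illustration, assuming the chamber and its power connections are described by an ABCD (chain) matrix and that voltage and current are peak-amplitude phasors; the element values are hypothetical and not taken from the reactor described here.

```python
def deembed(abcd, v_meas, i_meas):
    """Move a (V, I) measurement from the sensor plane to the plasma plane.

    abcd is ((A, B), (C, D)) with [V_sensor, I_sensor] = ABCD [V_plasma, I_plasma].
    """
    (a, b), (c, d) = abcd
    det = a * d - b * c
    v_plasma = (d * v_meas - b * i_meas) / det
    i_plasma = (-c * v_meas + a * i_meas) / det
    return v_plasma, i_plasma


def absorbed_power(v, i):
    """Average power for peak-amplitude phasors."""
    return 0.5 * (v * i.conjugate()).real


def series_then_shunt(z_series, y_shunt):
    """ABCD matrix of a series impedance followed by a shunt admittance
    (a hypothetical stand-in for feed inductance and stray capacitance)."""
    return ((1 + z_series * y_shunt, z_series), (y_shunt, 1.0))


if __name__ == "__main__":
    abcd = series_then_shunt(complex(0.5, 8.0), complex(0.0, 0.002))
    v, i = deembed(abcd, complex(13.58, -31.99), complex(1.08, 0.01))
    print("plasma impedance:", v / i, "absorbed power:", absorbed_power(v, i))
```

The plasma impedance follows directly as the ratio of the de-embedded voltage and current.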

Armed  with  our  knowledge  of  the  limitations  of  RF  sensing  at  13.56  MHz,  we 
investigated  the  use  of  the  sensor  for  closed-loop  control  of  the  power  deposited  in  the 
plasma.  The  fundamental  idea  was  to  control  the  power  delivered  to  the  plasma 
independently  of  losses  in  the  delivery  mechanism.  The  realization  of  the  control  system 
requires  highly  accurate  sensing  of  the  power  delivered  to  the  plasma  chamber.  As 
summarized  above,  extensive  investigation  of  this  sensing  problem  quantified  the 
inherent  difficulty  in  measuring  power  between  the  matching  network  and  the  powered 
electrode of the chamber. It was shown that both power losses and measurement error
grow with the plasma impedance mismatch, expressed as a standing wave ratio. As a result,
measurements are least accurate precisely when power losses are highest and an accurate
measurement is most needed. This makes the control problem very difficult. Our research
into  closed-loop  delivered  power  control  led  to  the  following  conclusion:  Closed-loop 
delivered  power  control  at  best  provides  an  incremental  benefit  at  substantial  cost.  At 
worst,  closed-loop  delivered  power  control  can  easily  result  in  a  deterioration  of  process 
performance  and  increase  process  variability. 

Our work on RF sensing at 13.56 MHz is reported in:

• C. Garvin, D. S. Grimard, and J. W. Grizzle, “RF Sensing Calibration for Real
Time Control of Plasma-Based and Etching,” 1998 International Conference on
Characterization and Metrology for ULSI Technology, NIST, Gaithersburg, MD,
March 23-27, 1998.

• Garvin, C., Grimard, D. S., Grizzle, J. W., and Gilchrist, B. E., “Measurement and
Accuracy Evaluation of Electrical Parameters at Plasma Relevant Frequencies and
Impedances,” Journal of Vacuum Science and Technology A, Volume 16, Number
2, Mar/Apr 1998, pp 595-606.

• C. Garvin, D. S. Grimard, and J. W. Grizzle, “The Impact of Receiver
Performance on the Determination of Electrical Parameters at Plasma Relevant
Frequencies and Impedances,” submitted to Journal of Vacuum Science and
Technology.

• H.-M. Park, C. Garvin, D. S. Grimard, and J. W. Grizzle, “Control of Ion Energy
in a Capacitively Coupled Reactive Ion Etcher,” Journal of the Electrochemical
Society, vol. 145, no. 12, pp. 4247-4252, 1999.

3.1.2  Broadband  RF  sensing  based  on  plasma  impedance  spectroscopy 

Having established the fundamental limitations on conventional approaches to plasma
sensing through monitoring of the fundamental of the 13.56 MHz RF signal, we extended our
research to consider RF measurement in a more general context. One of the standard
methods of approaching the problem is illustrated in Figure 19 (A). It has long been
known that an RF plasma generates harmonics in the RF signal, and that these harmonics
change as a function of plasma condition. However, from a control and diagnostic point
of view, the key question is not whether they change, but whether they change enough
(sensitivity) and whether the signal carries enough information content (noise immunity)
to extract process-relevant information about conditions in the plasma chamber.


Figure 19 (A) Plasma diagnostic via harmonic content of RF signals; (B) Actual harmonic
response, showing poor sensitivity. Both panels are plotted against frequency.

Figure 19 (B) shows the fundamental and 5 harmonics of the forward and reverse voltage of
the RF signal for two different plasma chemistries (i.e., conditions). The figure shows that
standard harmonic sensing has very poor sensitivity to this type of plasma characteristic.
This motivated us to consider an alternate approach to the RF diagnostic problem.


Since  the  early  1950’s,  high  frequency  excitation  by  means  of  a  resonance  probe  has  been 
used  to  extract  parameters  from  ionospheric  plasmas.  This  method  is  readily  adaptable  to 
a  processing  plasma,  as  shown  in  Figure  20  (A).  A  probe  is  inserted  in  the  plasma  and 
driven  over  a  wide  frequency  range  using  a  network  analyzer,  resulting  in  extremely 
accurate plasma measurements. The standard practice with this method has been to
determine only the resonant frequency, for the purpose of estimating the plasma density. We
quickly determined that there was valuable information throughout the frequency
spectrum and modified the technique to capture and process the entire frequency range,
hence the appellation “broadband”.


Figure  20  (A)  Resonance  probe  implementation;  (B)  Resonance  probe  response. 


The  graph  in  Figure  20  (B)  shows  a  plot  of  impedance,  expressed  as  reflection  coefficient 
vs.  frequency,  for  the  same  two  plasma  conditions  as  before.  Below  about  150  MHz,  the 
two  response  curves  are  very  similar.  In  the  range  150  to  700  MHz,  the  difference  is  very 
substantial.  Qualitatively,  at  least,  it  appears  that  the  “broadband”  approach  is  more 
likely  to  provide  relevant  and  reliable  information  about  the  process. 

A  relatively  large-scale  experiment  on  a  research  reactor  was  undertaken  to  confirm  the 
indications  of  Figure  19  (B)  and  Figure  20  (B).  The  goal  of  the  experiment  was  to 
reconstruct  the  power,  pressure  and  chemistry  set-points  of  the  plasma  from  the 
measurement  of  broadband  and  (standard)  narrow  band  (harmonic)  responses, 
respectively. The experiments were factorial (3 factors, 3 levels), using pure Ar, pure O2
and 50% Ar - 50% O2 gas mixtures, pressure at 100, 175 and 250 mTorr, and power at
100, 110, and 120 W. Fifteen semi-random repetitions of the experiment were performed.
The model was developed using a stepwise regression on 10 repetitions. In both
broadband  and  narrow  band  data,  training  data  was  augmented  by  additional  runs 
incorporating  an  offset  equal  to  the  measurement  uncertainty  in  the  instruments  used  to 
collect  the  data.  This  method  ensured  that  only  measurements  above  the  uncertainty  limit 
of  the  instrument  were  used  in  the  fitting  process. 

The fit performance was determined by applying the model developed on the 10 repetitions
to predict the 5 remaining repetitions. Broadband data was collected using an intrusive
antenna driven by an HP8753D network analyzer over a 27.5 MHz to 2.75 GHz
bandwidth. Narrow band data was obtained using a Werlatone D5281 2 MHz to 250
MHz directional coupler and an HP8592L Spectrum Analyzer. Performance is summarized
in Table 2. The results of this initial experiment confirmed that the broadband sensor
indeed has substantial potential as a diagnostic tool and warrants further investigation on
relevant manufacturing processes.


               Pressure    Chemistry    Power
broad band     97.9%       98.6%        83%
narrow band    65%         88%          76%

Table 2 Results of initial broad band versus narrow band experiment.


Figure  21  Experimental  setup  for  Broadband  vs.  narrow  band  comparison 

The  initial  test  of  the  new  sensor  was  followed  up  with  the  experimental  setup  shown  in 
Figure  21.  Plasma  conditions  of  power,  pressure  and  chemistry  were  varied  in  a  full 
factorial  experiment.  Electrical  measurements  were  obtained  simultaneously,  using  both 
broadband  and  “best-of-breed”  conventional  narrow  band  sensing  systems.  An  empirical 
model  was  regressed  on  half  of  the  experimental  data.  This  model  was  then  employed  in 
combination  with  electrical  signals  from  the  remaining  experimental  data  to  predict  the 
plasma conditions in terms of power, pressure and chemistry. The difference was large:
whereas the best narrow band model achieved R² = 0.714, averaged over the
three parameters, the broadband approach achieved R² = 0.971.
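
The hold-out evaluation described above (regress on part of the data, then score predictions on the rest with R²) can be sketched as follows. This is a simplified illustration: it fits a single-variable straight line rather than the empirical multi-variable model used in the study, and the function names and data are assumptions made for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination of predictions against held-out truth."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical sensor feature vs. pressure set-point (mTorr).
    train_x, train_y = [0.10, 0.20, 0.30, 0.40], [100.0, 152.0, 198.0, 251.0]
    test_x, test_y = [0.15, 0.25, 0.35], [126.0, 174.0, 226.0]
    slope, intercept = fit_line(train_x, train_y)
    predicted = [slope * xi + intercept for xi in test_x]
    print("hold-out R^2:", r_squared(test_y, predicted))
```

Scoring only on data the model has not seen is what makes the comparison between the two sensing approaches meaningful.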
Figure 22 Processing Compatible Non-Contacting Broad Band Probe


Having  established  the  viability  of  the  broadband  approach,  we  proceeded  with  the 
design  of  a  process-compatible  realization  of  the  sensor.  A  new  modular  non-contacting 
system,  shown  in  Figure  22,  was  designed  and  implemented  on  our  Lam  9400  TCP  and 
Plasmatherm  cluster  tool.  A  quartz  tube  insulates  the  antenna  from  the  discharge,  thus 
eliminating  the  problem  of  contamination.  An  experiment  was  performed  in  our  research 
reactor  to  compare  the  new  “non-contacting”  approach  to  the  existing  “contacting” 
approach.  The  new  system  in  fact  demonstrated  better  sensitivity  to  plasma  parameters 
than  the  previous  “contacting”  approach. 

Our work on broadband RF sensing is reported in:

• Garvin, C., Grimard, D. S., and Grizzle, J. W., “Advances in Broad Band RF
Sensing for Real-Time Control of Plasma-Based Semiconductor Processing,”
Journal of Vacuum Science and Technology A, Volume 17, Number 4, Jul/Aug
1999, pp 1377-1383.

• Garvin, C., Bilen, S. G., Stutzman, B. S., and Grizzle, J. W., “Implementation of
Broad Band RF Sensing on a Lam 9400 Reactor,” The Electrochemical Society
195th Meeting, May 2-7, 1999, Seattle, WA.

• C. Garvin and J. W. Grizzle, “A Demonstration of Broadband RF Sensing:
Empirical Polysilicon Etch Rate Estimation in a Lam 9400 Etch Tool,” to appear in
Journal of Vacuum Science and Technology A.

3.1.3  Fault  Detection  Using  the  Broadband  Sensor 

Early  and  accurate  detection  of  a  sensor  fault  is  critical  to  decreasing  semiconductor 
device  production  cost  and  to  shortening  the  manufacturing  cycle.  Mechanical  system 
failure  is  often  easy  to  detect  by  visual  inspection.  However,  sensor  failure  poses  a  much 
more  difficult  problem.  An  unobserved  drift  in  the  deposited  power,  chamber  pressure,  or 
gas  flow  rates  can  significantly  impact  the  etch  process,  yet  it  is  much  more  difficult  to 
detect  and  identify  the  source  of  such  variations.  We  therefore  investigated  the  use  of  the 
Broadband Radio Frequency (RF) sensor to detect faults in the deposited Transformer
Coupled Power (TCP) and pressure sensor in our Lam 9400 TCP tool. The distinctive RF
fingerprint of the process from the Broadband sensor is rich enough to distinguish plasma
variations due to faults in the measurement of TCP and pressure from variation in the
outputs of the remaining sensors. A non-parametric sign test was used to detect the
occurrence of sensor faults on the basis of the Broadband RF signal observation.

This  research  was  conducted  using  a  low-pressure,  high-plasma  density  Lam  TCP 
9400SE  plasma  etching  system  and  an  in-house-constructed  Broadband  RF  sensor.  The 
Broadband  RF  sensor  consists  of  a  2.54-cm  long  tungsten  probe  tip  inserted  in  an 
aluminum cylinder and contained in a quartz tube. A Hewlett Packard 8753B Vector
Network Analyzer was used to measure the complex reflection coefficient, Γ. A 6-inch
poly-Si wafer was etched under the Main Etch (ME) condition using Cl2 and HBr gases.
It was assumed that the sensor faults occur one at a time, so the five machine-input
variables (TCP and Bias power, pressure, and Cl2 and HBr flow rates) were changed one
at a time from their nominal values to observe their distinctive plasma fingerprints via the
Broadband RF sensor. TCP and Bias power and HBr flow rate were changed +/- 10, +/-
15, and +/- 25 % from their nominal values of 250 W, 180 W, and 75 sccm, respectively.
Since the pressure and Cl2 flow rate settings are relatively small values, 10 mTorr and 15
sccm, respectively, their nominal values were changed by +/- 20, +/- 30, and +/- 40 %.

3.1.3.1

Figure 23 shows a typical response of the Broadband RF sensor to a variation in TCP.

Figure 23 Variation of RF peak with TCP (horizontal axis: frequency in GHz).


Two things can be noted: there are two dominant peaks in the reflection coefficient for
each power setting, and the peak frequencies increase very nearly linearly with
TCP. A single sweep of the Broadband RF measurement results in 402 real numbers: 201
data points of magnitude and 201 data points of phase of the reflection
coefficient (Γ). As a first step in data reduction, an RLC circuit parameterization of the
RF response was performed. Each peak of the RF response was modeled as a series RLC
circuit (ω_n, Q, R), resulting in a significantly reduced data size (Figure 24).

Figure 24 RLC circuit parameterization result
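
The series RLC parameterization can be illustrated with a short sketch. It assumes the standard series-resonance relations (Q = ω_n L / R and ω_n = 1 / sqrt(LC)) and a 50 ohm reference impedance; both the reference impedance and the numerical values are assumptions for illustration, not values from the measurements reported here.

```python
import math


def rlc_from_features(omega_n, q, r):
    """Recover L and C of a series RLC circuit from (omega_n, Q, R)."""
    inductance = q * r / omega_n
    capacitance = 1.0 / (omega_n * q * r)
    return inductance, capacitance


def series_rlc_gamma(freq_hz, omega_n, q, r, z0=50.0):
    """Reflection coefficient of a series RLC load seen from a z0 line."""
    inductance, capacitance = rlc_from_features(omega_n, q, r)
    w = 2.0 * math.pi * freq_hz
    z = complex(r, w * inductance - 1.0 / (w * capacitance))
    return (z - z0) / (z + z0)


if __name__ == "__main__":
    omega_n = 2.0 * math.pi * 500e6  # hypothetical 500 MHz resonance
    for f in (400e6, 500e6, 600e6):
        print(f, abs(series_rlc_gamma(f, omega_n, q=10.0, r=20.0)))
```

Three numbers per peak replace the full sweep around that peak, which is the data reduction described above.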


Next, a regression model of TCP, centered about its nominal value, was determined as a
function of the measured RF values:

TCP_{est} = 0.1985 + 0.8574\,\omega_{n2} - 0.6529\,Q_1 + 0.9510\,Q_1\,\omega_{n1}    (1)

with R² = 0.94. Figure 25 illustrates the estimation of TCP modeled with RF sensor
information. Ideally, the estimated TCP would be a straight line in the region labeled
“TCP”, since only TCP was varied there. Nevertheless, the modeling error is within +/- 5
%. A TCP fault was declared if

|TCP_{nom} - TCP_{est}| / TCP_{nom} > \gamma    (2)

where TCP_{est} is the normalized estimated TCP value, TCP_{nom} is the normalized nominal TCP
setting, which is 1, and \gamma is a threshold to be determined. We assumed that the probability
of an individual sensor error equals 10^{-4}, and that these events are mutually
exclusive and equally likely, for the calculation of the probability of detection
(Pd) and the probability of false alarm (PF). Figure 26 illustrates the performance of the TCP
fault detector as a graph of Pd versus PF. The TCP fault detector provides
excellent performance when a TCP sensor error shifts TCP by more
than +/- 8 % from its nominal value.

Figure 25 Normalized minimum and maximum values of the estimated TCP (top) and Pressure (bottom)

Next, the regression model of pressure was determined as

Pressure_{est} = 0.4389 + 1.5931\,\omega_{n1} - 1.6611\,OES + 0.1451\,\omega_{n1}\,Q_1 + 0.4862\,\omega_{n2}\,OES    (3)

with R² = 0.845. The pressure modeling error is about +/- 10 % (Figure 25). A pressure
sensor fault was declared using the same form as eq. (2):

|Pressure_{nom} - Pressure_{est}| / Pressure_{nom} > \gamma    (4)

Figure 26 Receiver Operating Characteristic (ROC) of TCP (top) and Pressure (bottom) sensor
fault detector.
A deviation of more than +/- 20 % (+/- 2 mTorr in this study) caused by a pressure sensor
fault could be detected almost perfectly: Pd = 0.99 when PF = 10^{-4}. To check the
robustness of the TCP and pressure sensor fault detectors, a validation experiment was
performed 5 months later in the same etching system (Figure 27). Two false alarms of
pressure sensor fault detection occurred when the chlorine flow rate was low. This was due
to the modeling error of the pressure estimate. Overall, fault detection using the
Broadband RF sensor proved to be an accurate and reliable method for the TCP and
pressure sensors.

Figure 27 TCP and Pressure sensor fault detector validation experimental result.
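
The threshold test and the trade-off between detection and false alarm can be sketched as follows. This is an illustration only: it assumes the normalized estimation error is zero-mean Gaussian with a known standard deviation, which is a simplification introduced here and is not the non-parametric sign test used in the study. The threshold and sigma values are hypothetical.

```python
import math


def is_fault(estimate, nominal=1.0, threshold=0.08):
    """Declare a fault when the normalized deviation exceeds the threshold."""
    return abs(nominal - estimate) / nominal > threshold


def exceed_probability(shift, sigma, threshold):
    """P(|shift + n| > threshold) for Gaussian noise n with std sigma.

    shift = 0 gives the false alarm probability; a nonzero shift (a real
    sensor fault of that normalized size) gives the detection probability.
    """
    scale = sigma * math.sqrt(2.0)
    upper = 0.5 * math.erfc((threshold - shift) / scale)
    lower = 0.5 * math.erfc((threshold + shift) / scale)
    return upper + lower


if __name__ == "__main__":
    sigma = 0.02  # hypothetical normalized modeling error
    for threshold in (0.04, 0.06, 0.08):
        p_false = exceed_probability(0.0, sigma, threshold)
        p_detect = exceed_probability(0.10, sigma, threshold)
        print(threshold, p_false, p_detect)
```

Sweeping the threshold and plotting detection probability against false alarm probability traces out a curve of the kind shown in Figure 26.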

This work was reported in:

• Hyun-Mog Park, Dennis S. Grimard, Jessy W. Grizzle, and Fred Lewis Terry, Jr.,
“Etch profile control of high-aspect, deep submicron a-Si gate etch,” IEEE
Transactions on Semiconductor Manufacturing, 14, pp 242-254 (2001).

3.2  Wafer  State  Sensing 

3.2.1  Introduction 

Under funding from this program and the end of our SRC program, we independently
developed and made the first open, refereed publication on the use of specular-mode,
spectroscopic reflected light measurements from gratings for the extraction of critical
dimensions and detailed topography.1 Similar work was being carried out in proprietary
industrial laboratories,2 unknown to us at that time. Since these early results, this area has
grown into a significant metrology area in the Si integrated circuit industry, and the
method is now being employed as an in-line metrology technique for wafer-by-wafer
process control.3,4,5,6

Under funding from this MURI program, we demonstrated the first and (at this writing)
still the only applications of this method to real-time, in situ monitoring of fabrication
processes (in our case, reactive ion etching). We also performed the first and only
experiments to control the etch of a structure to a final target critical dimension value.7,8,9
The control/signal processing methods used to achieve these results will be described in
section 4.1. In this section, we will review the physics of the topography extraction
method including advances made since the last annual report.

3.2.2  Topography  Extraction  Example:  Photoresist  Lines 

We have experimentally investigated a number of ~350 nm structures using wafers with
large area 700 nm period photoresist gratings on 31.7 nm of SiO2 on a single crystal Si wafer.


The refractive index vs. wavelength of the photoresist was extracted from a similarly
processed unpatterned film using spectroscopic ellipsometry at 4 angles of incidence (60,
65, 70, 75°) to reduce the artifacts produced at antireflection points. The period of the
grating was verified by measuring the angle of the 1st order diffraction peak as a function
of wavelength under an illumination angle of 7°. This data was fit to the classic grating
equation:

\sin(\theta_m) = \frac{m\lambda}{\Lambda} + \sin(\theta_i)

where \Lambda is the grating period, \lambda is the wavelength, \theta_i is the angle of incidence, \theta_m is the
angle of diffraction, and m is the diffraction order (m = 1 for the \Lambda extraction). This
extraction and an SEM photo of a representative grating are shown in Figure 28.
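
The period extraction from the grating equation can be sketched in a few lines. The sketch assumes the sign convention of the equation above and a least-squares line through the origin; the function names and the synthetic wavelengths are illustrative and are not the measured data.

```python
import math


def diffraction_angle_deg(wavelength_nm, period_nm, incidence_deg, order=1):
    """Diffraction angle from sin(theta_m) = m*lambda/Lambda + sin(theta_i)."""
    s = order * wavelength_nm / period_nm + math.sin(math.radians(incidence_deg))
    if abs(s) > 1.0:
        raise ValueError("diffraction order is evanescent at this wavelength")
    return math.degrees(math.asin(s))


def fit_period_nm(wavelengths_nm, angles_deg, incidence_deg, order=1):
    """Least-squares grating period from measured diffraction angles.

    Fits (sin(theta_m) - sin(theta_i)) = (m * lambda) / Lambda through the origin.
    """
    sin_i = math.sin(math.radians(incidence_deg))
    num = 0.0
    den = 0.0
    for lam, ang in zip(wavelengths_nm, angles_deg):
        x = order * lam
        y = math.sin(math.radians(ang)) - sin_i
        num += x * y
        den += x * x
    return den / num


if __name__ == "__main__":
    wavelengths = [400.0, 450.0, 500.0, 550.0]
    angles = [diffraction_angle_deg(w, 700.0, 7.0) for w in wavelengths]
    print("recovered period (nm):", fit_period_nm(wavelengths, angles, 7.0))
```

With real data, the scatter of the points about the fitted line determines the uncertainty on the extracted period.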


Figure 28 SEM cross-sectional image of 350 nm line/space grating on 31.7 nm of SiO2 on Si
(left). The grating period was verified to be 700.14 ± 0.33 nm using measured 1st order diffraction
peaks vs. wavelength with an illumination angle of 7° and fits to the grating equation (right
figure).


Ex situ spectroscopic ellipsometry measurements were collected using a Sopra GESP-5
rotating polarizer ellipsometer with a photomultiplier/high-resolution scanning
monochromator detection system. Data were collected at 7, 63.5, and 73° AOIs for
comparison with in situ data from etch systems. For this sample, the near-normal
configuration proved to be the most sensitive, so only that data will be shown in this
paper. In other sample cases, AOIs of ~60-75° may prove to be advantageous.

The  data  from  both  our  ex  situ  and  in  situ  measurements  were  fit  using  parameterized 
geometric  models  for  the  lineshape,  rigorous  coupled  wave  analysis  (RCWA)  for  the 
specular  scattering  calculations,  and  Levenberg-Marquardt  nonlinear  regression  to 
optimize  the  fit  of  the  simulations  resulting  from  geometric  parameters  vs.  the  measured 
data. 
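
The regression loop described above can be sketched with a small Levenberg-Marquardt implementation. This is a toy illustration: the forward model is a two-parameter exponential standing in for the RCWA simulation (which is far more involved and is not reproduced here), and the damping schedule and function names are assumptions made for the example.

```python
import math


def model(params, x):
    """Toy forward model standing in for the optical simulation."""
    amp, rate = params
    return [amp * math.exp(-rate * xi) for xi in x]


def residuals(params, x, y):
    return [f - yi for f, yi in zip(model(params, x), y)]


def cost(params, x, y):
    return sum(r * r for r in residuals(params, x, y))


def lm_fit(start, x, y, lam=1e-3, max_iter=200, step=1e-6):
    """Two-parameter Levenberg-Marquardt with a forward-difference Jacobian."""
    p = list(start)
    best = cost(p, x, y)
    for _ in range(max_iter):
        if best < 1e-20 or lam > 1e10:
            break
        r = residuals(p, x, y)
        cols = []
        for k in range(2):
            q = list(p)
            q[k] += step
            cols.append([(a - b) / step for a, b in zip(residuals(q, x, y), r)])
        a11 = sum(v * v for v in cols[0])
        a22 = sum(v * v for v in cols[1])
        a12 = sum(u * v for u, v in zip(cols[0], cols[1]))
        g1 = sum(u * v for u, v in zip(cols[0], r))
        g2 = sum(u * v for u, v in zip(cols[1], r))
        # Damped normal equations: (J^T J + lam * diag(J^T J)) dp = -J^T r
        d11, d22 = a11 * (1.0 + lam), a22 * (1.0 + lam)
        det = d11 * d22 - a12 * a12
        trial = [p[0] + (-g1 * d22 + a12 * g2) / det,
                 p[1] + (-g2 * d11 + a12 * g1) / det]
        trial_cost = cost(trial, x, y)
        if trial_cost < best:
            p, best, lam = trial, trial_cost, lam / 10.0  # accept, trust more
        else:
            lam *= 10.0  # reject, fall back toward gradient descent
    return p


if __name__ == "__main__":
    xs = [0.5 * i for i in range(9)]
    ys = model([2.0, 0.5], xs)
    print(lm_fit([1.0, 1.0], xs, ys))
```

In the real problem each cost evaluation requires a full scattering simulation, so the number of parameters and iterations directly sets the computation time.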

First, let us examine a simple fit using a trapezoidal model and data from the 400-800 nm
range. This fit is shown in Figure 29. The quality of the fit is good (though it certainly
shows room for improvement), and the basic shape of the data is
captured by the fit. However, if the range of the measurement is extended into the UV
(230-825 nm), then the data and trapezoidal fit of Figure 30 result. Clearly, there is significant
structure in the shorter wavelength data that is not properly captured by the simple
trapezoidal fit. This is an indication that the topography is more complex. Furthermore,
the fact that the trapezoidal model yields sharper, larger amplitude features in the UV is an
indication that the real structure is more rounded on top than the sharp, relatively wide
top of the trapezoid.


Figure 29 Measured (solid curves) and fitted (dots) results for near-normal 7° AOI spectroscopic
ellipsometry data (α, β parameters) from 400-825 nm. The best fit trapezoidal model is shown to
the right.

Figure 30 The data from Figure 29 now extended to 230-825 nm. Note that there is significant
structure in the measured data that is not captured in the fitted curves (dots).


We proceeded through a succession of geometric models: (1) a trapezoid on a rectangle;
(2) a triangle on a trapezoid on a rectangle; and (3) finally a 3-segment quadratic fit
(basically a “triangle” with curvature sitting on a “trapezoid” with curvature sitting on a
second “trapezoid” with curvature). The final fit is shown in Figure 31. All models are
overlaid in Figure 32, and the best fit is also shown there overlaying a cross-sectional
SEM photo.


Figure 31 The same data as Figure 30 but now fit using a 3-segment, 2nd order polynomial fit of
the sidewall shape. Note that nearly all of the major structures in the data are fit by this geometric
model. The sources of the residual errors are not clear, but may be related to slight variations in
the refractive index of the resist line vs. the (n,k) reference data extracted from a similar blanket
test sample.

Figure 32 The results of the fit from Figure 31. The left figure shows all explored geometric
models overlaid (trapezoid, trapezoid-on-rectangle, triangle-on-trapezoid-on-rectangle, and
finally the 3-segment quadratic fit). The 3-segment, 2nd order fit is overlaid on the cross-sectional
SEM photo. The only adjustment of image size was a constant scaling to match the measured
period.

The  parameters  used  in  this  fit  are  shown  in  Table  3.  For  each  fitted  segment  j,  the  width 
of the segment at a given point y, measured down from the top of the segment, is given by:

For this fit, we set m10 = 0 for the top curved triangle. This was done since attempts to
extract a top width yielded numbers which were essentially zero and which had no
statistical significance. The remaining mj0's were linked to the bottom width of the
segment above so that the sidewall structure was a continuous function of y. For this fit,
this results in 9 independent parameters. The standard 95.4% confidence limits for this fit
are also shown in the table. All of the fitted parameters show confidence limits that are
less than the parameters themselves, indicating that all have some statistical validity in
the topography extraction. However, when the cross-correlation coefficients are examined
(Table 4), we see that there is strong coupling between the height of the main middle
segment (h2) and the height of the lower undercut segment (h3). Also, we see strong
coupling between the slope and the curvature of the main segment (m21 & m22). The latter
correlation is not too surprising, as the middle segment is close to vertical, and thus
resolution of the slight curvature is difficult. The h2-h3 correlation illustrates the difficulty
in clearly resolving the slight undercut. In summary, we have now pushed
this data to the limit of statistical merit. Further parameterization could improve the
quality of fit, but would not result in topographic parameters in which we could be
physically confident.

Table 3 Extracted parameters for the fits shown in Figure 31 and Figure 32.

Term    Value      95.4% conf. limit    Units
h1      146.51     4.55                 nm
m11     0.7389     0.0097               slope
m12     -0.4698    0.011                quadratic curvature
h2      545.72     36.05                nm
m21     0.3461     0.0272               slope
m22     -0.1921    0.0282               quadratic curvature
h3      112.35     34.79                nm
m31     0.0803     0.0529               slope
m32     -0.1933    0.0659               quadratic curvature

Table 4 Cross-correlation coefficients for the parameters in Table 3. Cross-correlations with
magnitude above 0.9 indicate cause for concern about over-fitting of the data.

        h1       m11      m12      h2       m21      m22      h3       m31      m32
h1      1        0.356    -0.217   -0.369   -0.176   0.121    0.267    0.101    0.04
m11     0.356    1        -0.88    -0.34    -0.31    0.354    0.301    -0.098   0.219
m12     -0.217   -0.88    1        0.373    -0.02    -0.08    -0.363   -0.146   -0.009
h2      -0.369   -0.34    0.373    1        0.512    -0.527   -0.993   -0.369   -0.108
m21     -0.176   -0.31    -0.02    0.512    1        -0.981   -0.493   0.286    -0.474
m22     0.121    0.354    -0.08    -0.527   -0.981   1        0.517    -0.31    0.501
h3      0.267    0.301    -0.363   -0.993   -0.493   0.517    1        0.394    0.082
m31     0.101    -0.098   -0.146   -0.369   0.286    -0.31    0.394    1        -0.866
m32     0.04     0.219    -0.009   -0.108   -0.474   0.501    0.082    -0.866   1
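
The over-fitting check based on cross-correlation can be sketched as follows. The sketch assumes the regression provides a parameter covariance matrix, converts it to correlation coefficients, and flags pairs whose magnitude exceeds 0.9; the function names are illustrative. The usage example applies the flagging step to the h2, m21, m22 and h3 entries of Table 4.

```python
import math


def correlation_matrix(cov):
    """Convert a parameter covariance matrix to correlation coefficients."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]


def flag_pairs(names, corr, limit=0.9):
    """List parameter pairs whose correlation magnitude exceeds the limit."""
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i][j]) > limit:
                flagged.append((names[i], names[j], corr[i][j]))
    return flagged


if __name__ == "__main__":
    names = ["h2", "m21", "m22", "h3"]
    corr = [
        [1.0, 0.512, -0.527, -0.993],
        [0.512, 1.0, -0.981, -0.493],
        [-0.527, -0.981, 1.0, 0.517],
        [-0.993, -0.493, 0.517, 1.0],
    ]
    for pair in flag_pairs(names, corr):
        print(pair)
```

The two flagged pairs are the h2-h3 and m21-m22 couplings discussed in the text.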

In situ measurements and real-time control experiments were conducted using these
samples. A Lam 9400 TCP SE poly Si etch system was slightly modified to add ports for
ellipsometry. Due to vacuum system constraints, the AOI was 63.5°. Spectroscopic
ellipsometry data was collected using a Sopra RTSE (real-time spectroscopic
ellipsometer), which is a rotating polarizer SE with a prism spectrometer/CCD array
detector for high speed data acquisition. This system allows data collection at a 180 ms
sampling interval. The SE measurement itself occurs in 100 ms. This fast data collection rate
allows each measurement to be treated as a quasi-static snapshot of the sample. The
gratings in these tests were rotated so that the plane of incidence was perpendicular to the
grating. Representative data from these measurements is shown in Figure 33.

Figure 33 Representative data from the in situ RTSE measurements of the photoresist grating.
Measured data and regression fits are shown (left) and the resulting topography fit (trapezoid on
rectangle model) is shown on the right.


We used an O2 plasma to trim back the photoresist lines. A small library of SE
simulations vs. geometries around the expected topography trajectory was used to
determine the coefficients of a convex-hull-based nonlinear filtering algorithm (NLF).
This NLF is a form of pattern matching with some robustness to noise and
experimental data distortions. It allowed rapid topography extraction (~0.25 s on a
600 MHz PIII PC) during the trim-back step. Automatic endpoint was triggered for the
desired trim-back to 200 nm bottom CD. The before and after cross-sections resulting
from this run are shown in Figure 34. The trapezoid on rectangle fit yielded a starting
condition of 296 nm bottom CD, 169 nm top CD, an 84.2° upper segment sidewall angle
and a total line height of 777 nm. The final structure has a 200 nm bottom CD, 71 nm top
CD, an 82.1° upper segment sidewall angle, and a total line height of 697 nm. A
“movie”  of  this  process  can  be  viewed  on  the  website 
http://www.eecs.umich.edu/~fredtv/spie2003/sprt7  tritraprect.mov  (Quicktime). 


Figure 34 Real-time, in situ spectroscopic ellipsometry extractions of the beginning photoresist shape
(left) and the post-trim-back photoresist shape (right), overlaid on cross-sectional SEM photos.


Close examination of the post-trim-back photo shows that there is a slight undercut of the
photoresist line. However, if we attempt to extract this additional feature detail from the
RTSE data, we obtain a physically unreasonable total line undercut (just a few tens of nm
on the bottom) and confidence limits much larger than the fitted parameters themselves.
This limited accuracy arises because

1. The RTSE data is more limited in wavelength (shortest usable wavelength
~300 nm) than the ex situ, scanning system measurements;

2. The RTSE data is somewhat distorted due to stray light effects in the prism
spectrometer, which reduce the intensity and sharpness of the structure of the (α, β)
data.

The wavelength limitation issue is qualitatively obvious, but further work will be
required to quantitatively understand the resolution limits of this measurement technique.
This will be an area of continuing research interest. The distortion issue in using the RTSE
data was one of the primary reasons that sophistication was required in the signal
processing/control methods that use this data for endpoint detection. These
methods are discussed in Section 4.1.

3.2.3  Challenges  and  Future  Work 

Specular mode, spectroscopic reflected light measurements (sometimes called specular
scatterometry) have moved rapidly from the research world to industrial application.
These methods are presently being used in integrated circuit fabrication control on an
in-line/wafer-to-wafer basis. In that regard, this is a major success which has at least
partially been pushed forward by this program. However, several fundamental issues still
remain:

1. Developing a quantitative understanding of the resolution limitations of the various
possible extraction and measurement modes.

2. Developing practical methods for measurement of aperiodic structures and the
extension of real-time results demonstrated under this program to product wafers
(for true real-time, in situ control of topography).

3. Pushing the resolution limits into the true nano-scale regime.

4. Developing measurement methods for sparse (isolated) structures.
4 Control


4.1  Real-Time  State  Estimation  for  Patterned  Wafers 

4.1.1  Introduction 

Real-time  estimation  and  control  of  semiconductor  wafer  topography  is  becoming 
increasingly  crucial  as  pattern  dimensions  of  modern  integrated  circuits  continuously 
shrink.  Our  work  is  concerned  with  the  problem  of  real-time  estimation  of  the  profile  of  a 
periodic  grating  during  plasma  etching  with  optical  observation.  Difficulties  in  this 
problem  are  that  no  control-oriented  dynamic  models  for  plasma  etching  are  available, 
and  that  the  existing  observation  model  is  of  high  computational  complexity.  Due  to  these 
difficulties,  standard  approaches  such  as  extended  Kalman  filtering  cannot  be  applied. 

Our  approach  is  to  develop  a  new  state  estimation  technique  that  overcomes  unknown 
dynamics  and  computationally  complex  observation.  We  propose  a  nonlinear  filtering 
algorithm  for  unknown  dynamics,  and  high-complexity,  high-dimensional  observation. 
Experimental  results  with  the  proposed  algorithm  include  the  first  reported  successful 
real-time  monitoring  of,  and  end  point  detection  for,  patterned  wafer  topography 
evolution  during  plasma  etching.  The  algorithm  is  shown  to  produce  state  estimates  with 
bounded  error  under  bounded  disturbances  and  redundant  observations,  and  to  reduce  to 
the  maximum  a  posteriori  probability  estimator  under  an  ideal  case.  Finally,  a  version  of 
the  proposed  algorithm  is  used  to  obtain  angle  of  incidence-free  real-time  wafer  state 
estimates. 

4.1.2  Motivating  Problem  and  Previous  Work 

Reactive ion etching (RIE) is an important plasma etching technique commonly used in
VLSI and ULSI fabrication. However, the underlying physics and chemistry in RIE are
known to be very complex and multi-variable, and they are not very well understood. In
fact, no control-oriented, physics-based dynamic models for the patterned-wafer
topography evolution during RIE are currently available. Moreover, since the
process should not be disturbed, many of the crucial parameters cannot be directly
measured in real time. For real-time estimation and control of patterned wafer
parameters, the RIE process therefore needs to be coupled with a non-destructive in situ sensor that
is sensitive to etch parameters.

Optical measurement techniques such as spectral reflectometry and spectroscopic
ellipsometry are the most favorable techniques for indirectly determining the surface
properties of a sample being processed because of their non-destructive nature and
sensitivity, and have become widely used methods in various applications. Consider the
situation where a light beam is incident on a sample surface at a fixed incidence angle.
According to elementary optics, after reflection from the sample surface, a linearly polarized
single-wavelength light beam, with wavelength $\lambda$, becomes elliptically polarized,
resulting in complex reflection coefficients $R_p(\lambda)$ and $R_s(\lambda)$. One of our optical in situ
sensors, the Two Channel Spectroscopic Reflectometer (2CSR) [10], measures $|R_p(\lambda)|^2$ and
$|R_s(\lambda)|^2$ over multiple wavelengths $\lambda$ to extract the properties of a patterned surface.



Figure  35  Trapezoidal  periodic  grating 

We are focused on the RIE process on a photoresist grating with the 2CSR observation.
As in Figure 35, the profile of a photoresist grating is approximated to be trapezoidal. In
this case, given the grating period, the wafer state $x_k$ at time $k$ is defined to be the triplet
of Thickness, Top Width, and Wall Angle. The measured 2CSR output $z_k$ at time $k$ is
the $2l$-tuple of $|R_p(\lambda_1)|^2$, $|R_s(\lambda_1)|^2$, ..., $|R_p(\lambda_l)|^2$, $|R_s(\lambda_l)|^2$, when $l$ wavelengths
$\lambda_1, \ldots, \lambda_l$ are considered. Let $X \subset \mathbb{R}^3$ be the space of all possible wafer states. Then a
function $h : X \to \mathbb{R}^{2l}$, such that $h(x)$ is the model output corresponding to a wafer state
$x \in X$, is obtained from the rigorous coupled wave analysis (RCWA) [11]. The function $h$
is not in a closed analytic form, but rather given by a complex numerical code, so the
computational demand of the RCWA model is very high, and direct use of $h$ is
impossible in real-time applications. Indeed, given $x_k$, evaluation of the model output
$y_k = h(x_k)$ takes up to a minute on a high-end workstation; on the other hand, given $z_k$,
the nonlinear least squares model fitting that minimizes $\|z_k - h(x)\|$ over $x \in X$ takes
about 15 minutes on the same machine.

In the case of unpatterned wafers, one can use a low-complexity optical observation
model, and real-time etch rate estimation is possible via extended Kalman filtering with a
random-walk approximation of the etch rate evolution [12]. However, in the case of
patterned wafers, due to the high computational complexity of the RCWA model, the
common practice has been to assume that the dynamics are completely unknown and resort
to ex situ nonlinear least squares model fitting to extract the wafer state trajectory [13, 14].
Although low-complexity approximation of the RCWA model has been attempted using
neural network models trained with simulated data [15], there has been no reported
experimental validation of such an approximate model. On the other hand, empirical
modeling of the 2CSR observation [16] has turned out to be of limited applicability due to
the large amount of experimental data required to train a neural network.

4.1.3  Problem  Formulation 

Let $X \subset \mathbb{R}^n$ be the state space. Consider the discrete-time state space model

$$x_{k+1} = f_k(x_k) + w_k,$$
$$z_k = h(x_k + v_k) + e_k, \qquad k = 0, 1, \ldots,$$

where $x_k \in X$ is the state and $z_k \in \mathbb{R}^m$ is the output at time $k$. The functions $f_k$ are
unknown, and the function $h$ cannot be evaluated in real time; $w_k$ are input disturbances,
$v_k$ represent a part of systematic errors in modeling, and $e_k$ define the deviation of the
measured outputs $z_k$ from the model outputs $y_k = h(x_k + v_k)$ due to measurement noise
as well as the modeling errors that are not taken into account by $v_k$.
Since $f_k$ are unknown, we use approximations of them, either given at the outset or
determined from data. For each $k$, fix a mapping $\mathcal{F}_k$ that maps the observation sequence
$z^k = (z_0, \ldots, z_k)$ into the space of functions from $X$ into $\mathbb{R}^n$. Then the collection $\{\mathcal{F}_k\}$
is a subsidiary filter that determines an approximation $\hat{f}_k = \mathcal{F}_k(z^k)$ of $f_k$ for each $k$. On
the other hand, since $h$ cannot be directly used, we construct a sampled version of it as
follows. Choose a finite set $D$ whose points form a uniform grid on $X$. Then $D$ defines
a finite number of closed boxes $R_1, \ldots, R_p$ of the same size such that $R_i \cap D$ is the set
of vertices of $R_i$ for each $i$. By offline model simulation, we obtain a sample $\mathcal{H}_D$ of $h$
given by $\mathcal{H}_D = \{(d, h(d)) : d \in D\}$.
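
The offline sampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the program: `toy_model` is a hypothetical stand-in for the expensive RCWA simulation, and the grid bounds and step counts are made-up values chosen only to show the data structure.

```python
import itertools
import math

def uniform_grid(lower, upper, steps):
    """Return the points of a uniform grid D on the box [lower, upper].

    steps[j] is the number of grid intervals along axis j.
    """
    axes = []
    for lo, hi, s in zip(lower, upper, steps):
        axes.append([lo + (hi - lo) * i / s for i in range(s + 1)])
    return list(itertools.product(*axes))

def toy_model(x):
    """Hypothetical stand-in for the observation model h (NOT an RCWA code)."""
    thickness, top_width, wall_angle = x
    return (
        math.cos(thickness / 100.0),
        math.sin(top_width / 50.0),
        math.tan(math.radians(wall_angle)) / 10.0,
        thickness * top_width * 1e-5,
    )

def build_library(model, grid):
    """Offline step: evaluate the model once per grid point, {(d, h(d)) : d in D}."""
    return {d: model(d) for d in grid}

# Illustrative state box: thickness (nm), top width (nm), wall angle (deg).
grid = uniform_grid(lower=(600.0, 50.0, 80.0),
                    upper=(800.0, 200.0, 86.0),
                    steps=(4, 3, 3))
library = build_library(toy_model, grid)
```

Because every model evaluation happens before the etch starts, the real-time filter only ever needs dictionary lookups on `library`, which is the point of replacing $h$ with its sampled version.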

Now, our filtering problem is as follows: given $\{\mathcal{F}_k\}$ and $\mathcal{H}_D$, get an estimate $x_{k|k}$ of
$x_k$ (and a prediction $x_{k+1|k}$ of $x_{k+1}$) based on $z^k$ for each $k$. Any technique tackling this
problem should be capable of dealing with unknown dynamics and computationally
complex observation.

4.1.4  Proposed  Filtering  Algorithm 

We propose that the filtering algorithm consist of the following steps at each time $k$: to
derive an estimate $\hat{y}_k$ of the model output $y_k$; to find a function $h^-$, which contains $\hat{y}_k$
in its domain, such that $h^-(y_k) \in h^{-1}(\{y_k\})$ whenever $y_k \in h(X)$, and to evaluate
$h^-(\hat{y}_k)$; to obtain an estimate $x_{k|k}$ of $x_k$ by compensating for the systematic error $v_k$
from $h^-(\hat{y}_k)$; and to make a prediction $x_{k+1|k}$ of $x_{k+1}$. The following algorithm has this
structure. (For $S \subset \mathbb{R}^n$, $\mathrm{conv}(S)$ denotes the convex hull of $S$.)

Algorithm A.

Step 0. Fix boxes $W, V \subset \mathbb{R}^n$; make a prediction $x_{0|-1}$ of $x_0$; let $X_0 = x_{0|-1} + W$; set
$k = 0$.

Step 1. Get $z_k$; determine $\hat{f}_k$ from $z^k$.

Step 2. Let $D_k = (X_k + V) \cap D$.

Step 3. Compute $\hat{y}_k = \arg\min\{\|y - z_k\| : y \in \mathrm{conv}(h(D_k))\}$, and obtain $\{\lambda_d\}_{d \in D_k}$ such
that $\sum_{d \in D_k} \lambda_d = 1$, $\lambda_d \ge 0$, and $\hat{y}_k = \sum_{d \in D_k} \lambda_d h(d)$.

Step 4. Let $h^-(\hat{y}_k) = \sum_{d \in D_k} \lambda_d d$, and compute
$x_{k|k} = \arg\min\{\|x - x_{k|k-1}\| : x \in h^-(\hat{y}_k) - V\}$.

Step 5. Set $x_{k+1|k} = \hat{f}_k(x_{k|k})$, and $X_{k+1} = x_{k+1|k} + W$.

Step 6. Increment $k$ to $k + 1$; go to Step 1.

In Algorithm A, Steps 1-4 correspond to the measurement update step of nonlinear
filtering: the set $D_k$ determines the constraint sets for the optimization steps that follow
Step 2; the two optimization steps, Step 3 and Step 4, are coupled through the choice of
$h^-$ and evaluation of $h^-(\hat{y}_k)$. Step 5 is the time update step: the function $\hat{f}_k$ is used to
predict the next state; the box $X_k$ is evolved into $X_{k+1}$, so that it is used to update $D_k$ into
$D_{k+1}$ at time $k + 1$.
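
The measurement and time updates can be sketched in a few lines of Python. This is a minimal sketch under several assumptions that are not in the report: $V = \{0\}$, the window $X_k$ is given by per-axis half-widths of $W$, the hull projection of Step 3 is solved with Frank-Wolfe iterations (a solver chosen here for brevity, not necessarily the one used by the authors), and the library is a toy linear model. All function names are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def project_onto_hull(points, z, iters=500, tol=1e-12):
    """Step 3: approximate the point of conv(points) nearest to z.

    Frank-Wolfe with exact line search. Returns (y_hat, weights) with
    weights >= 0 summing to 1 and y_hat = sum_i weights[i] * points[i].
    """
    weights = [0.0] * len(points)
    weights[0] = 1.0
    y = list(points[0])
    for _ in range(iters):
        grad = [yi - zi for yi, zi in zip(y, z)]      # gradient of 0.5*||y - z||^2
        i_best = min(range(len(points)), key=lambda i: dot(grad, points[i]))
        step_dir = [vi - yi for vi, yi in zip(points[i_best], y)]
        gap = -dot(grad, step_dir)                    # Frank-Wolfe duality gap (>= 0)
        denom = dot(step_dir, step_dir)
        if gap <= tol or denom == 0.0:
            break
        gamma = min(1.0, gap / denom)                 # exact line search on [0, 1]
        weights = [(1.0 - gamma) * w for w in weights]
        weights[i_best] += gamma
        y = [yi + gamma * di for yi, di in zip(y, step_dir)]
    return y, weights

def filter_step(library, z, x_pred, half_width):
    """Steps 2-4 with V = {0}; assumes the window contains at least one grid point."""
    # Step 2: grid points inside the box x_pred +/- half_width.
    d_k = [d for d in library
           if all(abs(di - xi) <= wi for di, xi, wi in zip(d, x_pred, half_width))]
    # Step 3: project the measurement onto the hull of the sampled outputs.
    _, weights = project_onto_hull([library[d] for d in d_k], z)
    # Step 4: apply the same convex weights to the grid points themselves.
    return tuple(sum(w * d[j] for w, d in zip(weights, d_k))
                 for j in range(len(x_pred)))

def run_filter(library, measurements, x0_pred, half_width, f_hat=lambda x: x):
    """Iterate Steps 1-6; f_hat is the assumed dynamics (identity by default)."""
    estimates, x_pred = [], x0_pred
    for z in measurements:
        x_est = filter_step(library, z, x_pred, half_width)
        estimates.append(x_est)
        x_pred = f_hat(x_est)                         # Step 5: time update
    return estimates

# Toy 1-D example: h(x) = (x, 2x + 1) sampled on the grid {0, 1, 2, 3}.
toy_library = {(d,): (d, 2.0 * d + 1.0) for d in (0.0, 1.0, 2.0, 3.0)}
estimates = run_filter(toy_library, [(0.5, 2.0), (0.8, 2.6)], (1.0,), (1.5,))
```

In the toy example the measurements are exact outputs of a linear model, so the estimates fall at the off-grid states 0.5 and 0.8; with a real library the quality of the estimate depends on the grid spacing and on how close $h$ is to linear within the window.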

4.1.5  Experimental  Results 

Algorithm A has enabled us to achieve the first reported real-time state estimation for
sub-micron patterned wafers during plasma etching [16]. Assuming that the etch rate is
sufficiently low, let $\hat{f}_k(x) = x$ for all $x$; for simplicity, let $V = \{0\}$. Using the trapezoidal
approximation of the grating profile as in Figure 35, and choosing 46 wavelengths from
the spectral range of the 2CSR outputs, we have $n = 3$ and $m = 92$.
Figure 36 shows a typical real-time wafer state estimation result obtained using
Algorithm A during an RIE process for a photoresist grating. For reference, the offline
nonlinear least squares fits are marked by x’s. For boxes $S \subset \mathbb{R}^n$, denote the $j$-th side-length
of $S$ by $\mathrm{len}_j(S)$, $j = 1, 2, 3$; let $\mathrm{len}_j(R_1) = l_j$. Four different sets of wafer state
estimates are produced with four different choices of $W$: (a) $\mathrm{len}_j(W) = 3l_j$ (solid lines);
(b) $\mathrm{len}_j(W) = 2l_j$ (dotted lines); (c) $\mathrm{len}_j(W) = 4l_j$ (dash-dotted lines); (d)
$\mathrm{len}_j(W) = 5l_j$ (dashed lines). Among (a)-(d), choice (a) is the only one that is within the
range of $W$ expected to yield good wafer state estimates according to the analysis result
given in Section 4.1.6. Indeed, the estimates produced by choice (a) are the closest to
the nonlinear least squares results: the real-time estimates of Thickness, Top Width, and
Wall Angle given by the solid lines are uniformly within 5 nm (1 %), 5 nm (3 %), and
0.5° (0.5 %), respectively, of the least squares fits.
Figure 37 Left (a) is a sample cross-section and trapezoidal fit before the etch. Right (b) is the
cross-section after the trim-back etch with the before and after trapezoidal fits.

Algorithm A has also enabled us to achieve successful end point detection results [16, 17].
The cross-section scanning electron microscope (SEM) photos demonstrating such a
result are shown in Figure 37. The bottom width of the photoresist grating is considered to
be the critical dimension. A bottom width of 250 nm is targeted, and an RIE process for
a photoresist grating is terminated at time $k$ once the real-time wafer state estimate $x_{k|k}$
leads to a bottom width less than or equal to the target bottom width. Figure 37(a) shows
the SEM photo from a part of the sample before the RIE process and the estimated profile
at the beginning of the etching. Figure 37(b) shows the SEM photo after the RIE process
and the estimated grating profile from Algorithm A at the end of etching. In this example,
the end point estimates of the bottom width from Algorithm A and nonlinear least squares
model fitting are 249 nm and 251 nm, respectively. They are in good agreement with the
SEM photo although there are some ghost images and shining lines due to the space
charging effects of the SEM.
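
The end point rule described above can be sketched as follows. This is an illustrative sketch, not the program's implementation: it assumes a symmetric trapezoid whose wall angle is measured between the sidewall and the wafer plane, and the estimate trajectory is made up for the example.

```python
import math

def bottom_width(thickness, top_width, wall_angle_deg):
    """Bottom width of a symmetric trapezoid.

    Assumes the wall angle is measured between the sidewall and the wafer
    plane, so each side adds thickness / tan(angle) to the top width.
    """
    return top_width + 2.0 * thickness / math.tan(math.radians(wall_angle_deg))

def find_endpoint(estimates, target_bottom_width):
    """Return the first time index whose estimated bottom width is at or
    below the target, or None if the target is never reached."""
    for k, (thickness, top_width, wall_angle_deg) in enumerate(estimates):
        if bottom_width(thickness, top_width, wall_angle_deg) <= target_bottom_width:
            return k
    return None

# Made-up estimate trajectory: (thickness nm, top width nm, wall angle deg).
trajectory = [(700.0, 180.0, 84.0), (690.0, 150.0, 84.0), (680.0, 105.0, 84.0)]
stop_index = find_endpoint(trajectory, target_bottom_width=250.0)
```

In a real-time setting the loop body would run once per new estimate $x_{k|k}$, and the etch would be terminated as soon as the condition first holds.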


4.1.6  Analysis  Results 

A generalized version of Algorithm A has been analyzed in [18] and shown to be stable
(that is, to produce state estimates with bounded error) under certain conditions. In
particular, it can be shown that Algorithm A is stable under the following assumption:

Assumption A.

(a) The state space $X$ is compact; there are $\lambda_k > 0$ such that
$\|f_k(x) - f_k(x')\| \le \lambda_k \|x - x'\|$ for all $k$ and all $x, x' \in X$.

(b) There are $\varepsilon_k > 0$, and compact $X_0, W_k, V_k \subset \mathbb{R}^n$ such that $x_0 \in X_0$, $w_k \in W_k$,
$v_k \in V_k$, and $\|e_k\| \le \varepsilon_k$ for all $k$.

(c) For each $R_i$, $i \in \{1, \ldots, p\}$, there exist $A_i \in \mathbb{R}^{m \times n}$ and $a_i \in \mathbb{R}^m$ such that
$h(x) = A_i x + a_i$ for $x \in R_i$.

(d) Let $\rho$ be the largest integer satisfying the following: whenever $i_1, \ldots, i_\rho \in \{1, \ldots, p\}$
are such that $\bigcup_{j=1}^{\rho} R_{i_j}$ is connected and $\{A_{i_1}, \ldots, A_{i_\rho}\} = \{A_{i_1}, \ldots, A_{i_q}\}$,
$q \le \rho$, with $A_{i_j} \ne A_{i_k}$ for $j < k \le q$, the matrix $[A_{i_1} \cdots A_{i_q}] \in \mathbb{R}^{m \times qn}$ has full rank.
Then $\rho \ge 3^n$.

(e) There exist compact $F_k \subset \mathbb{R}^n$ such that $f_k(x) - \hat{f}_k(x) \in F_k$ for $x \in X$.

(f) For simplicity, let $V_k = V = \{0\}$; for $q \le \rho$, let $\mathcal{A}_q$ be the set of matrices
$[A_{i_1} \cdots A_{i_q}]$ where $i_j$ are as in (d); let $\mathcal{B}_\rho$ be the set of $\sum_{j=1}^{q} B_j$ over $q \le \rho$,
$A \in \mathcal{A}_q$, and $(A^T A)^{-1} A^T = [B_1^T \cdots B_q^T]^T$; let $\beta = \max\{\|B\| : B \in \mathcal{B}_\rho\}$. Then the
sets $-x_{0|-1} + X_0$ and $\lambda_k \mathrm{Ball}(\beta\varepsilon_k) + F_k + W_k$ are contained in the box centered at
the origin and of the same size as $R_1$. (For $\varepsilon > 0$, $\mathrm{Ball}(\varepsilon)$ denotes the closed $\varepsilon$-ball
centered at the origin.)

(g) The box $W$ is centered at the origin with $3l_j \le \mathrm{len}_j(W) < 4l_j$, $j = 1, 2, 3$.

Assumption A(a) says that the functions $f_k$ are Lipschitz continuous on the compact
state space, and Assumption A(b) says that the disturbances are all bounded. The function
$h$ is piecewise linear and continuous, and each piece of $h$ is defined by Assumption
A(c); that is, we replace the “true” $h$ with a piecewise linear approximation of it, and let any
error incurred in doing so be absorbed in either $v_k$ or $e_k$. We call the number $\rho$ defined
in Assumption A(d) the “redundancy number” of $h$; $\rho > 0$ implies that at least each
piece of $h$ is one-to-one, and $\rho \ge 3^n$ implies that the inequality $m \ge n3^n$ is likely to hold
(as long as $h$ is not intentionally made to be very simple). Assumption A(e) says that $\hat{f}_k$
are approximations of $f_k$ with bounded error, and Assumption A(f) says that the
underlying disturbances and errors are small. Finally, Assumption A(g) says that the box
$W$ is sufficiently large and sufficiently small at the same time.
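
As a quick arithmetic check, the dimension condition can be compared with the experimental setup of Section 4.1.5, where $n = 3$ and $m = 92$. The second inequality below corresponds to adding one more state variable, the case that Section 4.1.7 reports as violating the condition.

```latex
n\,3^{n} = 3 \cdot 27 = 81 \le 92 = m,
\qquad
(n+1)\,3^{n+1} = 4 \cdot 81 = 324 > 92 = m .
```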

Lemma A.

If Assumption A(c) holds with $\rho > 0$, and if $i_1, \ldots, i_\rho \in \{1, \ldots, p\}$ are such that
$U = \bigcup_{j=1}^{\rho} R_{i_j}$ is connected, then there exist $B \in \mathbb{R}^{n \times m}$ and $b \in \mathbb{R}^n$ such that the function
$\phi : \mathrm{conv}(h(U)) \to \mathbb{R}^n$ defined by $\phi(y) = By + b$ for $y \in \mathrm{conv}(h(U))$ is a left inverse of
$h|_U$; that is, $x = \phi(h(x))$ for $x \in U$. Furthermore, for any convex combination
$\sum_j \lambda_j h(d_j)$, $d_j \in U$, we have $\sum_j \lambda_j d_j = \phi(\sum_j \lambda_j h(d_j))$.
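
For intuition, consider the special case in which $U$ is a single box on which $h(x) = Ax + a$ with $A \in \mathbb{R}^{m \times n}$ of full column rank. Restricting to one piece is an assumption made here only for illustration; Lemma A itself covers connected unions of boxes. In that case a left inverse is the least-squares pseudo-inverse, and linearity gives the convex-combination identity directly:

```latex
\begin{aligned}
\phi(y) &= (A^{T}A)^{-1}A^{T}(y - a),
  \qquad \text{so } B = (A^{T}A)^{-1}A^{T},\quad b = -Ba,\\
\phi(h(x)) &= (A^{T}A)^{-1}A^{T}(Ax + a - a) = x,\\
\phi\Big(\sum_j \lambda_j h(d_j)\Big)
  &= B\Big(A\sum_j \lambda_j d_j + a\sum_j \lambda_j\Big) + b
   = \sum_j \lambda_j d_j
  \quad \text{when } \sum_j \lambda_j = 1 .
\end{aligned}
```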

According to Lemma A, inversion of $h$ is easy if $h$ is piecewise linear with sufficiently
large redundancy number $\rho$. Algorithm A exploits this, and produces state estimates
with bounded error if Assumption A holds; that is, provided that the observations are
redundant with sufficiently large redundancy number, and that the disturbances are small
and bounded:

Theorem A.

If Assumption A holds, then there exists a $\beta > 0$ such that $\|x_k - x_{k|k}\| \le \beta\varepsilon_k$ for all $k$.

Algorithm A also has a certain optimality property: it reduces to the
maximum a posteriori probability (MAP) estimator in an ideal case.

Assumption B.

(a) The output disturbance terms $e_k$ are identically zero; that is, $e_k = 0$ for all $k$.

(b) The disturbances $x_0, w_0, v_0, w_1, v_1, \ldots$ are all independent, and uniformly
distributed on boxes $X_0, W_0, V_0, W_1, V_1, \ldots$, respectively.

(c) There are vectors $u_k \in \mathbb{R}^n$ and diagonal matrices $F_k \in \mathbb{R}^{n \times n}$, with $-I \le F_k \le I$,
such that $f_k(x) = F_k x + u_k$ for $x \in X$.

(d) The function $h$ is one-to-one with a left inverse $h^{-1}$.

(e) For all $k$, we have $V_k - V_k \subset W_k$.

In the ideal case where Assumption B holds, there exists a simple algorithm that generates
MAP estimates:

Theorem B.

Suppose that Assumption B holds. Choose $x_{0|0} \in X_0 \cap (h^{-1}(z_0) - V_0)$ and let

$$x_{k+1|k} = f_k(x_{k|k}),$$
$$x_{k+1|k+1} = \arg\min\{\|x - x_{k+1|k}\| : x \in h^{-1}(z_{k+1}) - V_{k+1}\}$$

for $k = 0, 1, \ldots$. Then we have $p_{k|k}(x_{k|k} \mid z^k) \ge p_{k|k}(x \mid z^k)$ for all $x \in X$ and $k$. (For
$x \in X$, $p_{k|k}(x \mid z^k)$ denotes the conditional probability density of $x_k = x$ given $z^k$.)

The optimization step in Theorem B is very similar to that in Step 4 of Algorithm A. In
fact, Theorem B implies the following: if Assumptions A and B both hold, and if $\hat{f}_k = f_k$
for all $k$, then Algorithm A is the MAP estimator.

4.1.7 Angle of Incidence-Free Real-Time Wafer State Estimation

Like many optical in situ sensors, the 2CSR uses a fixed angle of incidence (AOI), and
the RCWA model is obtained under the assumption that the exact value of the true AOI is
known. However, it is found that a small deviation of the assumed AOI from the true AOI
can cause considerable error in real-time wafer state estimates, as well as in least squares
model fitting results. Thus, it is desirable to develop an AOI-free version of Algorithm A.
Let the true AOI be another state variable, denoted by $\theta_k$ at time $k$, whose evolution is
given by

$$\theta_{k+1} = \theta_k, \qquad k = 0, 1, \ldots.$$

Let $\Theta$ be the space of true angles of incidence. Then we have a function
$h : X \times \Theta \to \mathbb{R}^m$ that represents the extended RCWA model such that

$$z_k = h(x_k + v_k, \theta_k) + e_k, \qquad k = 0, 1, \ldots.$$

Choose a finite set $T$ whose points form a uniform grid on $\Theta$. Then by offline RCWA
simulation, we obtain a sample of $h$ given by
$\mathcal{H}_{D \times T} = \{((d, \theta), h(d, \theta)) : d \in D \text{ and } \theta \in T\}$. Now, the AOI-free filtering problem is as
follows: given $\{\mathcal{F}_k\}$ and $\mathcal{H}_{D \times T}$, get an estimate $(x_{k|k}, \theta_{k|k})$ of $(x_k, \theta_k)$ based on $z^k$ for
each $k$.


An analysis result in Section 4.1.6 indicates that, to have good state estimates, we may
require $m \ge (n + 1)3^{n+1}$, which is violated when $n = 3$ and $m = 92$. In fact, experiments
show that direct use of Algorithm A for the AOI-free filtering problem does not yield
satisfactory results, so we slightly modify Algorithm A: at time $k$, obtain an intermediate
estimate $\tilde{x}_{k|k}$ of $(x_k, \theta_k)$ from Step 4 of Algorithm A; if $\tilde{x}_{k|k} = (d_1, \ldots, d_n, d_{n+1})$, then set
$x_{k|k} = (d_1, \ldots, d_n)$ and

$$\theta_{k|k} = \theta_{k|k-1} + \frac{d_{n+1} - \theta_{k|k-1}}{1 + \lambda + \cdots + \lambda^k}$$

for some $\lambda \in [0, 1]$, where $\theta_{k+1|k} = \theta_{k|k}$.
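
The AOI smoothing step can be sketched as below. The sketch assumes the update is read as $\theta_{k|k} = \theta_{k|k-1} + (d_{n+1} - \theta_{k|k-1})/(1 + \lambda + \cdots + \lambda^k)$ with $\theta_{k+1|k} = \theta_{k|k}$; this reading of the extracted formula, the function names, and the sample numbers are assumptions made for illustration.

```python
def aoi_update(theta_pred, aoi_raw, lam, k):
    """One AOI smoothing step.

    theta_pred: prediction theta_{k|k-1}
    aoi_raw:    raw AOI component d_{n+1} of the intermediate estimate
    lam:        weighting parameter in [0, 1]
    k:          time index (k >= 0)
    """
    denom = sum(lam ** i for i in range(k + 1))   # 1 + lam + ... + lam**k
    return theta_pred + (aoi_raw - theta_pred) / denom

def smooth_aoi(raw_estimates, theta_init, lam):
    """Run the recursion, using theta_{k+1|k} = theta_{k|k} as the prediction."""
    theta = theta_init
    history = []
    for k, aoi_raw in enumerate(raw_estimates):
        theta = aoi_update(theta, aoi_raw, lam, k)
        history.append(theta)
    return history

# Made-up raw AOI values (degrees) from three successive time steps.
running_mean = smooth_aoi([72.0, 74.0, 73.0], theta_init=70.0, lam=1.0)
no_smoothing = smooth_aoi([72.0, 74.0, 73.0], theta_init=70.0, lam=0.0)
```

Under this reading, $\lambda = 1$ gives the running mean of the raw AOI values and $\lambda = 0$ passes each raw value through unchanged, so $\lambda$ trades noise averaging against responsiveness.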
Figure 38

The above AOI-free algorithm has been shown to be capable of accurately estimating the
wafer state in real time without knowing, or assuming, the true AOI value [19]. Figure 38
shows a typical result. The experimental data used in Fig. 2 are known to come from a
true AOI ≈ 73°, so we re-used the data here. In the figure, the AOI estimates are shown to
converge to (a small neighborhood of) the true AOI, and the resulting AOI-free wafer
state estimates (solid lines) become virtually the same as the fixed-AOI estimates with
assumed AOI = 73° (dashed lines) once the AOI estimates are sufficiently close to the
true AOI.


4.1.8  Future  Work 

The next direction for extending our experimental results on real-time state estimation for
patterned wafers is to develop a more general model of patterned structures and analyze
the corresponding optical sensor measurements. Analysis indicates that the proposed
filtering algorithm can be applied to other systems with redundant observations and
bounded disturbances; examples include power systems with redundant meter readings,
and communication systems with redundant data.
4.2 Design of Experiments

4.2.1  Process  and  Yield  Improvement  Through  Spatial  Modeling  of  Defect 
Clustering: 

Methods  for  quality  control  and  yield  improvement  in  the  IC  fabrication  industry  have 
traditionally  relied  on  overall  summary  measures  such  as  lot  or  wafer  yield.  These 
measures  are  appropriate  if  the  defective  ICs  are  distributed  randomly  both  within  and 
across  wafers  in  a  lot.  In  practice,  however,  the  defects  often  occur  in  clusters  or  display 
other  systematic  spatial  patterns.  These  spatially  clustered  defects  often  have  assignable 
causes  due  to  specific  process  or  equipment  problems.  Vijay  Nair  has  collaborated  with 
scientists  and  engineers  from  Bell  Labs  and  Lucent  Microelectronics  to  develop  statistical 
methodology  that  exploits  spatial  information  in  wafer  map  data  for  process  and  yield 
improvement  in  IC  fabrication. 

These  methods  have  been  deployed  extensively  within  Lucent  Microelectronics  and  have 
led  to  considerable  benefits  in  process  and  yield  improvements.  Presentations  of  the 
results  have  been  given  at  various  conferences  including  a  workshop  at  the  SEMATECH 
symposium  in  Austin,  TX  in  May  1999. 
4.2.1.1 Monitoring of Spatial Processes in Semiconductor Manufacturing:

This research developed novel statistical process control (SPC) methods to automatically
detect the presence of severe spatial clustering of defects. A family of control charting
procedures was developed and its properties studied under a Markov random field model
of mild spatial clustering as well as under different patterns of large-scale clustering.
Based on this, a specific method was proposed and its usefulness demonstrated through
applications to real data.

4.2.1.2 Process Improvement Through the Analysis of Spatially Clustered Defects on Wafer Maps:

This research develops an overall strategy for yield improvement in integrated circuit
fabrication by exploiting important spatial information in wafer map data. Both
visualization tools and flexible methods of analysis are developed to analyze the
high-dimensional and highly structured data. Algorithms for identifying spatial patterns at the
wafer and lot level are developed. These are used to develop failure diagnostics, and the
spatial signatures are related to potential manufacturing problems in order to improve the
manufacturing process.

4.2.1.3  Yield  Modeling  in  IC  Fabrication: 

There has been tremendous interest in developing yield models as a function of device
size and complexity so that reliable capacity and cost estimates can be obtained. Most of
the research in the literature has focused on developing statistical distributions, such as
the negative binomial, that would better fit the marginal yield data. Such models miss
critical information available in the actual spatial distributions of defects. This research
developed model-free methods for estimating some of the commonly used yield metrics
that are used to track IC fabrication processes. These methods are shown to be superior to
the time-honored windowing techniques that are used extensively.

4.2.1.4  Spatial  Mixture  Modeling  and  Cluster  Detection: 

This research develops a new methodology which models the wafer map data in IC
fabrication processes as mixtures of spatially homogeneous Markov random fields. A
Bayesian hierarchical model and Markov Chain Monte Carlo (MCMC) methods with
Gibbs sampling are used to estimate the underlying model parameters. The ICM (iterated
conditional modes) algorithm provides a quick way to “recover” the spatial
clustering pattern. The techniques are illustrated on binary probe yield data and count
data on particulate defects in IC fabrication. This work was part of Li-an Xu’s Ph.D.
thesis, which was completed in May 2000.

4.2.2  Optimal  Design  of  Experiments  for  Modeling  Processes  with  Feedback 
Variables: 


To design a feedback control system, one has to first develop a model for the
relationships among the input variables, potential feedback variables, and the output
variables and then select the appropriate variables for feedback control. This is typically
done through statistically designed experiments where the input variables are
systematically varied, and the experiment is run in an initial, open-loop mode without
feedback control. Vijay Nair and Li-an Xu have studied the optimal design of
experiments for modeling such a process. Their paper considers a general statistical
formulation of the problem and studies the properties of optimal designs in the linear
case. Locally optimal designs under the D-optimality criterion are studied in detail, while
results for A-optimality and Bayesian optimal designs are also developed. The results are
used to characterize the potential loss in efficiency, under various situations, in using the
classical designs for this problem.

This work was part of Li-an Xu’s Ph.D. dissertation. This research is related to the Ph.D.
thesis work of Oliver Patterson, done under the supervision of P. Khargonekar.
Patterson’s thesis examined methods for analyzing data from the experiment in order to
select a suitable combination of variables for feedback control, while this work studied
methods for designing the experiment. The paper has been accepted for publication in the
Journal of Statistical Planning and Inference. The research has been presented as invited
papers at the Spring Research Conference on Statistics in Industry and Technology,
Minneapolis, in June 1999 and at the Annual Statistics Society of Canada Meeting in
Regina, June 1999.

4.2.3  PID  Controllers  and  SPC  for  Autocorrelated  Data:

4.2.3.1  Efficiency  and  Robustness  of  PI  Controllers:


Feedback  control  schemes  have  been  in  wide  use  in  process  industries  for  many  years. 
They  have  also  been  used  increasingly  in  discrete-parts  manufacturing  in  recent
years.  This  paper  studies  the  efficiency  and  robustness  properties  of  discrete  PI
schemes  under  some  commonly  encountered  situations.  For  the  process  disturbance,  we
consider  the  stationary  ARMA(1,1)  model  and  the  nonstationary  ARIMA(1,1,1)  model.
Process  dynamics  are  studied  under  a  first-order  dynamic  model,  including  the  special  case
of  pure  gain.  The  efficiency  of  PI  schemes  is  compared  with  that  of  minimum  mean 
squared  error  (MMSE)  schemes  under  these  models.  The  PI  schemes  are  seen  to  be  quite 
efficient  over  a  broad  range  of  the  parameter  space.  Further,  the  PI  schemes  are  much 
more  robust  than  MMSE  schemes  to  model  misspecifications,  especially  the  presence  of
first-order  non-stationarity.  The  results  here  provide  additional  justification  for  the  use  of 
discrete  PI  schemes.  This  is  joint  work  between  V.  Nair,  F.  Tsung  (Hong  Kong),  and  H. 
Wu  (Iowa  State). 
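
The setting described above can be sketched with a small simulation: a nonstationary ARIMA(1,1,1) disturbance acting on a pure-gain process, adjusted by a discrete PI rule. All numerical values (`phi`, `theta`, `k_p`, `k_i`, the unit gain, the seed) and the function names are assumptions chosen for illustration; they are not the settings studied in the paper, and no MMSE comparison is attempted here.

```python
import random

def arima_111(n, phi=0.5, theta=0.3, sigma=1.0, seed=0):
    """Simulate d_t with (1 - phi*B)(1 - B) d_t = (1 - theta*B) a_t."""
    rng = random.Random(seed)
    series, w_prev, a_prev, level = [], 0.0, 0.0, 0.0
    for _ in range(n):
        a = rng.gauss(0.0, sigma)
        w = phi * w_prev + a - theta * a_prev   # stationary ARMA(1,1) increment
        level += w                              # integrate once: nonstationary level
        series.append(level)
        w_prev, a_prev = w, a
    return series

def run_pi(disturbance, k_p=0.2, k_i=0.3, gain=1.0):
    """Apply a discrete PI adjustment to a pure-gain process.

    The output deviation at time t is e_t = d_t + gain * u_{t-1}, and the
    adjustment is u_t = -(k_p * e_t + k_i * sum of e_1..e_t).
    """
    errors, u_prev, err_sum = [], 0.0, 0.0
    for d in disturbance:
        e = d + gain * u_prev
        err_sum += e
        u_prev = -(k_p * e + k_i * err_sum)
        errors.append(e)
    return errors

def mean_square(values):
    return sum(v * v for v in values) / len(values)

d = arima_111(2000, seed=1)
uncontrolled_mse = mean_square(d)          # disturbance left to drift
controlled_mse = mean_square(run_pi(d))    # same disturbance under PI adjustment
```

With these assumed gains the closed loop is stable, so the controlled deviations stay bounded while the unadjusted disturbance drifts. The integral term is what removes the first-order nonstationarity, which is consistent with the robustness property noted above.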

4.2.3.2  PID-based  Control  Charts  for  Process  Monitoring:


Using  the  connection  between  a  PID-control  scheme  and  the  corresponding  predictor,  this 
research  develops  a  class  of  control  chart  procedures  for  process  monitoring,  especially 
for  use  with  auto-correlated  processes.  This  PID-based  scheme  includes  as  special  cases 
several  well-known  and  popular  techniques,  such  as  the  EWMA  and  EWMAST  (P-based)
and  Montgomery  and  Mastrangelo  (I-based)  charts.  The  performance  of  the
procedures  is  studied  under  mean  shifts  and  various  auto-correlation  structures.  It  is 
shown  that  performance  within  this  class  of  procedures  can  be  optimized  by  tuning  the 
chart  parameters  through  appropriately  defined  capability  indices.  Examples  are  given  to 
illustrate  the  design  and  performance  of  the  procedures.  This  is  joint  work  with  W.  Jiang 
and  F.  Tsung  of  Hong  Kong  University  of  Science  and  Technology,  H.  Wu  of  Iowa  State 
University,  and  K.  L.  Tsui  of  Georgia  Tech. 
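
For reference, the EWMA chart named above as a special case can be sketched as follows. This is the standard EWMA chart for independent observations with exact time-varying limits; the smoothing constant `lam`, the limit width, and the example data are assumptions for illustration. The general PID-based chart and its tuning through capability indices are not reproduced here.

```python
import math

def ewma_chart(data, target=0.0, sigma=1.0, lam=0.2, width=3.0):
    """Return (EWMA statistics, index of the first out-of-limit point or None)."""
    z, stats, first_signal = target, [], None
    for t, x in enumerate(data, start=1):
        z = lam * x + (1.0 - lam) * z
        # Exact standard deviation of z_t for independent data with variance sigma^2.
        sd = sigma * math.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t)))
        if first_signal is None and abs(z - target) > width * sd:
            first_signal = t - 1            # report a zero-based index
        stats.append(z)
    return stats, first_signal

# Hypothetical data: five on-target readings followed by a sustained 3-sigma shift.
readings = [0.0] * 5 + [3.0] * 10
stats, signal_index = ewma_chart(readings)
```

In this example the statistic first crosses its limit on the second shifted observation. For autocorrelated data the limits above would be too narrow or too wide depending on the correlation structure, which is the situation the PID-based charts are designed to handle.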
5  Process  and  Materials  Research


In  this  area,  three  major  accomplishments  were  achieved  under  this  program:

1.  Development  of  a  novel  ion-beam  modification  process  for  the  deposition  of  Al
films  that  are  more  resistant  to  grain  growth.

2.  Development  of  improved  plasma  deposition  processes  for  the  manufacture  of 
high  performance  AMLCDs. 

3.  Development  of  a  new  technique  for  infrared  measurement  of
thin  film  properties  without  substrate  interference.

These  results  have  been  covered  in  detail  in  prior  annual  reports  in  this  program.
During  the  final  phase  of  this  program,  funding  in  these  areas  was  curtailed  and  no
significant  new  results  were  produced  after  the  year  2000  annual  report.
48.  “Electrical  Instability  in  High-Rate  Deposited  Hydrogenated  Amorphous  Silicon  Thin
Films”,  Tong  Li,  Chun-Ying  Chen,  Charles  T.  Malone,  and  Jerzy  Kanicki,  presented 
at  Materials  Research  Society  Fall  Meeting,  December  2-6,  1996  in  Boston,  MA. 

49.  “Longitudinal  Vibrational  Absorption  Modes  of  Hydrogenated  Amorphous  Silicon 
Nitride  Thin  Films”,  T.  Li,  and  J.  Kanicki,  presented  at  Materials  Research  Society 
Spring  Meeting,  April  13-17,  1998  in  San  Francisco,  CA. 
8  Industrial  Interactions  and  Transfer


8.1.1.1  NIST-ATP  Program


A  NIST  Advanced  Technology  Program  for  the  Intelligent  Control  of  the  Semiconductor 
Patterning  Process  was  started  in  part  due  to  developments  under  this  MURI  Center.  The 
goal  of  this  program  was  to  improve  the  control  of  the  definition  of  the  critical  MOS 
transistor  gate  dimensions  through  sensor-based  control  of  both  the  photolithography  and 
reactive  ion  etching  steps.  The  program  was  originally  led  by  National  Semiconductor
Corporation,  and  the  additional  partners  were  KLA-Tencor,  Lam  Research,  FSI
International,  the  University  of  Michigan,  and  Stanford  University.  Major  subcontractors
included  the  University  of  California-Berkeley,  the  University  of  California-Irvine,  and
Cymer  Laser  Corporation.  National  Semiconductor  left  the  program  due  to  a  major
restructuring  of  that  company  and  KLA-Tencor  assumed  the  lead  partner  position.  The 
research  leader  of  the  program  was  Dr.  Matt  Hankison  (KLA-Tencor).  This  program 
significantly  aided  in  the  effective  transfer  of  some  of  our  advanced  control  and  sensing 
research  to  these  industrial  partners  and  some  of  their  related  suppliers. 

8.1.1.2  Lucent  Technologies  -  Bell  Labs


Vijay  Nair  has  collaborated  with  researchers  at  Bell  Laboratories  and  process/product 
engineers  at  Lucent. 

8.1.1.3  Plasma  Modeling


The  UI  has  a  vigorous  program  of  industrial  interaction  and  technology  transfer  centered 
about  our  activities  in  plasma  equipment  modeling.  Our  research  tasks  on  this  MURI 
grant  are  able  to  leverage  this  existing  infrastructure.  The  HPEM  has  been  transferred  to 
12  industrial  partners  and  2  national  laboratories,  and  updates  of  the  HPEM  based  on 
MURI  activities  have  been  made  available  to  all  users. 

Our  industrial  interactions  relevant  to  the  MURI  tasks  are  as  follows. 


Company             Point of Contact                                            Leveraged Funding Source

AMD                 Zoran Krivokapic (zoran@grape.amd.com)                      SRC, NSF
Applied Materials   Dimitris Lymberopoulos (Dimitris_Lymberopoulos@amat.com)    Applied Materials, SRC
LAM Research        Tom Ni (tom.ni@lamrc.com)                                   LAM Research, SRC
Motorola            Michael Hartig (mikeh@giedi.sps.mot.com)                    SRC, NSF
LSI Logic           Valery Sukhare (vsukhare@lsil.com)                          SRC, NSF
Texas Instruments   Bill Dostalik (dostalik@spdc.ti.com)                        SRC, NSF
9  Honors  and  Awards


Fellows 

IEEE  Fellows  -  J.  W.  Grizzle,  P.  P.  Khargonekar,  M.  J.  Kushner 

American  Statistical  Association  Fellows  -  V.  N.  Nair  and  C.  F.  J.  Wu 

American  Physical  Society  Fellow  -  M.  J.  Kushner

Optical  Society  of  America  Fellow  -  M.  J.  Kushner 

Institute  of  Mathematical  Statistics  Fellows  -  V.  N.  Nair  and  C.  F.  J.  Wu

SEMI  Outstanding  Achievement  Award  -  J.  Moyne 

American  Society  of  Materials  Fellows  -  D.  Srolovitz  and  G.  Was 

NACE  Fellow  -  G.  Was 

Best  Paper  Awards: 

1.  D.  Srolovitz,  American  Institute  of  Chemical  Engineers  Annual  Meeting,  1997.

2.  T.  L.  Vincent,  Best  Paper  Presentation  in  Session,  American  Control  Conference, 
1998. 

3.  Hsu-Ting  Huang,  Ji-Woong  Lee,  Brooke  S.  Stutzman,  Pete  Klimecky,  Craig 
Garvin,  Pramod  P.  Khargonekar,  and  Fred  L.  Terry,  Jr.,  “Real  Time  In  Situ 
Monitoring  of  Deep  Sub-μm  Topography  Evolution  during  Reactive  Ion
Etching,”  SEMATECH  AEC/APC  Symposium,  Lake  Tahoe,  NV.,  September  25- 
28,  2000.  (One  of  4  best  student  paper  awards  at  this  conference) 

4.  P.  Klimecky,  C.  Garvin,  “Plasma  Density  &  Resonant  Cavity  Modes  vs. 
Chamber  Condition  in  High  Density  RIE,”  SEMATECH  AEC/APC  Symposium, 
Lake  Tahoe,  NV.,  September  25-28,  2000.  (best  student  paper  award) 

5.  “In  Situ  Monitoring  Of  Deep  Sub-μm  Topography  Evolution  And  Endpoint
Detection  During  Reactive  Ion  Etching,”  Hsu-Ting  Huang,  Ji-Woong  Lee,  Pete 
Klimecky,  Pramod  P.  Khargonekar,  and  Fred  L.  Terry,  Jr.,  SEMATECH 
AEC/APC  Symposium  XIII,  October  6-11,  2001,  Banff,  Alberta,  Canada  (best 
student  paper  award  honorable  mention) 

6.  “Elimination  of  the  RIE  1st  Wafer  Effect:  Real-Time  Control  of  Plasma  Density,” 
Pete  I.  Klimecky,  Jessy  W.  Grizzle,  and  Fred  L.  Terry,  Jr.,  SEMATECH 
Advanced  Equipment  Control/Advanced  Process  Control  Symposium,
Snowbird,  Utah,  September  2002.  (best  student  paper  award)